📅 2025-08-15 — Session: Analyzed Global Scholarly Publication Statistics

🕒 11:55–12:05
🏷️ Labels: Scholarly Articles, Data Analysis, Python, Data Storage, Publication Statistics
📂 Project: Media
⭐ Priority: MEDIUM

Session Goal

The goal was to analyze global scholarly publication statistics, drawing on several databases and fields, to understand publication trends and distributions.

Key Activities

  • Ran search queries to gather statistics on global scholarly article publications for 2023, including estimates from databases such as Scopus and Web of Science.
  • Ran Python scripts that use an exponential growth model to calculate the average yearly count of Science and Engineering (S&E) articles for each decade from the 1950s to the 2020s.
  • Implemented code snippets to calculate non-health totals by decade and allocate these totals by discipline.
  • Created Pandas DataFrames from decade allocations and calculated historical totals for pre-1950 and post-1950 periods.
  • Developed functions to estimate storage requirements for different data types and dimensions, and calculated yearly storage needs.
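
The decade averages and discipline allocation described above can be sketched roughly as follows. This is a minimal illustration, not the session's actual script: the function names, the base year, the base count, the growth rate, the health share, and the discipline shares are all hypothetical placeholders, and the model assumed here is the simple form N(t) = N0 * exp(r * (t - t0)).

```python
import math


def yearly_count(year, base_year, base_count, growth_rate):
    """Article count in `year` under N(t) = N0 * exp(r * (t - t0))."""
    return base_count * math.exp(growth_rate * (year - base_year))


def decade_average(decade_start, base_year, base_count, growth_rate):
    """Mean yearly count over the ten years starting at `decade_start`."""
    counts = [
        yearly_count(year, base_year, base_count, growth_rate)
        for year in range(decade_start, decade_start + 10)
    ]
    return sum(counts) / len(counts)


def allocate(total, shares):
    """Split `total` across disciplines in proportion to `shares`."""
    weight = sum(shares.values())
    return {name: total * share / weight for name, share in shares.items()}


# Placeholder inputs for illustration only, NOT values from the session.
BASE_YEAR = 1950
BASE_COUNT = 100_000.0
GROWTH_RATE = 0.04
HEALTH_SHARE = 0.35
SHARES = {"physics": 0.3, "chemistry": 0.3, "engineering": 0.4}

# Average yearly count for each decade, 1950s through 2020s.
averages = {
    decade: decade_average(decade, BASE_YEAR, BASE_COUNT, GROWTH_RATE)
    for decade in range(1950, 2030, 10)
}

# Non-health decade total, then allocation by discipline.
allocation = {
    decade: allocate(avg * 10 * (1 - HEALTH_SHARE), SHARES)
    for decade, avg in averages.items()
}

for decade, avg in averages.items():
    print(f"{decade}s: average {avg:,.0f} articles per year")
```

A nested dictionary such as `allocation` can be passed to `pandas.DataFrame` to get a table with one column per decade and one row per discipline.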

Achievements

  • Gathered and analyzed data on scholarly article publications, covering 2023 database estimates and decade-level S&E counts from the 1950s to the 2020s.
  • Developed Python functions and scripts for decade-level totals, allocation by discipline, and storage estimation.
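
The storage estimation listed under Key Activities might look something like the sketch below. It assumes that "data types and dimensions" refers to storing one fixed-length numeric vector per article, which the log does not confirm. The function names, the dtype table, and the article count are hypothetical.

```python
# Bytes per element for a few common numeric types.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "int8": 1}


def vector_bytes(dims, dtype):
    """Size in bytes of one vector with `dims` elements of type `dtype`."""
    return dims * BYTES_PER_ELEMENT[dtype]


def yearly_storage_gib(articles_per_year, dims, dtype):
    """Storage in GiB for one vector per article over a year."""
    return articles_per_year * vector_bytes(dims, dtype) / 2**30


# Placeholder article count for illustration only, NOT a session result.
ARTICLES_PER_YEAR = 1_000_000

for dtype in BYTES_PER_ELEMENT:
    for dims in (384, 768):
        gib = yearly_storage_gib(ARTICLES_PER_YEAR, dims, dtype)
        print(f"{dtype:>8} x {dims:>4} dims: {gib:.2f} GiB per year")
```

This counts raw vector payload only. Index structures, metadata, and replication would add to the total.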

Pending Tasks

  • Refine projections of future scholarly output and the storage it implies.
  • Explore non-health scientific articles and their data indexing requirements in more depth.