📅 2025-08-15 — Session: Analyzed Global Scholarly Publication Statistics
🕒 11:55–12:05
🏷️ Labels: Scholarly Articles, Data Analysis, Python, Data Storage, Publication Statistics
📂 Project: Media
⭐ Priority: MEDIUM
Session Goal
Analyze global scholarly publication statistics across bibliographic databases (such as Scopus and Web of Science) and across fields, to understand publication trends over time and how output is distributed by discipline.
Key Activities
- Ran search queries to gather 2023 statistics on global scholarly article output, including estimates from databases such as Scopus and Web of Science.
- Ran Python scripts to estimate the average yearly count of Science and Engineering (S&E) articles for each decade from the 1950s to the 2020s, using an exponential growth model.
- Wrote code to compute non-health article totals by decade and to allocate those totals across disciplines.
- Built pandas DataFrames from the decade allocations and calculated historical totals for the pre-1950 and post-1950 periods.
- Developed functions to estimate storage requirements for different data types and dimensions, and used them to calculate yearly storage needs.
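
The decade calculations above can be sketched roughly as follows. This is a minimal illustration, not the session's actual scripts: the base count, growth rate, non-health share, and discipline shares are all hypothetical placeholders, and the helper names (`yearly_count`, `decade_average`, `allocate_by_discipline`) are invented for this sketch.

```python
import math

import pandas as pd

# ASSUMPTIONS: every constant below is an illustrative placeholder,
# not a figure taken from the session or from any database.
BASE_YEAR = 1950
BASE_COUNT = 100_000        # hypothetical S&E articles published in BASE_YEAR
GROWTH_RATE = 0.04          # hypothetical continuous annual growth rate
NON_HEALTH_SHARE = 0.7      # hypothetical fraction of articles outside health
DISCIPLINE_SHARES = {       # hypothetical split of non-health output (sums to 1)
    "Physical sciences": 0.40,
    "Engineering": 0.35,
    "Social sciences": 0.25,
}


def yearly_count(year):
    """Exponential growth model: N(t) = N0 * exp(r * (t - t0))."""
    return BASE_COUNT * math.exp(GROWTH_RATE * (year - BASE_YEAR))


def decade_average(start_year):
    """Mean modeled yearly count over the ten years starting at start_year."""
    counts = [yearly_count(y) for y in range(start_year, start_year + 10)]
    return sum(counts) / len(counts)


def allocate_by_discipline(total, shares):
    """Split a total across disciplines in proportion to the given shares."""
    return {name: total * share for name, share in shares.items()}


rows = []
for start in range(1950, 2030, 10):  # 1950s through 2020s
    avg = decade_average(start)
    # Modeled decade total (the 2020s figure includes years not yet elapsed).
    non_health_total = avg * 10 * NON_HEALTH_SHARE
    row = {
        "decade": f"{start}s",
        "avg_yearly": avg,
        "non_health_total": non_health_total,
    }
    row.update(allocate_by_discipline(non_health_total, DISCIPLINE_SHARES))
    rows.append(row)

decades = pd.DataFrame(rows).set_index("decade")
post_1950_total = decades["non_health_total"].sum()

print(decades.round(0))
print(f"Modeled post-1950 non-health total: {post_1950_total:,.0f}")
```

Keeping the model parameters as named constants makes it straightforward to swap in fitted values once real counts from the databases are available.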
Achievements
- Gathered and analyzed data on scholarly article publications, covering both historical and current trends.
- Developed a set of Python functions and scripts for data analysis and storage estimation.
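
A storage estimation function of the kind mentioned above might look like the sketch below. It assumes that "data types and dimensions" refers to the numeric type and length of a fixed-size vector stored per article, which the log does not confirm. The function names and the article count are hypothetical; only the byte widths of the numeric types are standard.

```python
# Standard byte widths for common numeric types.
BYTES_PER_VALUE = {"float32": 4, "float16": 2, "int8": 1}


def vector_bytes(dimensions, dtype="float32"):
    """Raw size in bytes of one vector with the given length and type."""
    return dimensions * BYTES_PER_VALUE[dtype]


def yearly_storage_gb(articles_per_year, dimensions, dtype="float32",
                      vectors_per_article=1):
    """Raw storage in decimal gigabytes, ignoring index and metadata overhead."""
    total_bytes = (articles_per_year * vectors_per_article
                   * vector_bytes(dimensions, dtype))
    return total_bytes / 1e9


# ASSUMPTION: 3,000,000 articles per year is a placeholder, not a session result.
ARTICLES_PER_YEAR = 3_000_000

for dtype in BYTES_PER_VALUE:
    for dims in (384, 768, 1536):
        gb = yearly_storage_gb(ARTICLES_PER_YEAR, dims, dtype)
        print(f"{dtype:>8} x {dims:>4} dims: {gb:8.2f} GB/year")
```

The estimate covers raw vector data only; a real index would add overhead that depends on the storage engine.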
Pending Tasks
- Refine the projections of future scholarly output and their storage implications.
- Explore non-health scientific articles and their data indexing requirements in more depth.