Social Sciences and Big Data
Platforms and Challenges
DOI:
https://doi.org/10.37467/revtechno.v12.3383
Keywords:
Big data, Humanities, Platforms, Research, Repositories, Social Sciences
Abstract
The objective of this research was to explore and characterize the main big data repositories in the social sciences available in 2021. The research design was non-experimental, exploratory, and descriptive. The population consisted of 110 big data repositories located through the Google dataset search engine. The sample comprised the top 10 of these. The results indicated that the most important big data repositories and platforms are concentrated in the private sector, primarily in US companies.
License
Authors who publish in this journal accept the following terms:
- Authors retain the moral rights to the work and transfer the commercial rights.
- One year after publication, the work will be available open access on our website, but copyright will be retained.
- Authors who wish to assign a Creative Commons (CC) license may request it by writing to publishing@eagora.org