Research Datasets: Difference between revisions
Initial |
fixed formatting Tag: wikieditor |
||
Line 2: | Line 2: | ||
= Datasets = | = Datasets = | ||
'' [[#Finding%20Datasets|Finding Datasets]] - [[#Advanced%20Google%20Query%20for%20Datasets|Advanced Google Query for Datasets]] | ''[[#Finding%20Datasets|Finding Datasets]] - [[#Advanced%20Google%20Query%20for%20Datasets|Advanced Google Query for Datasets]] | ||
'' [[#Known%20Datasets|Known Datasets]] | ''[[#Known%20Datasets|Known Datasets]] | ||
Datasets to use for Research (See [[research|Research wiki]]) | Datasets to use for Research (See [[research|Research wiki]]) | ||
Line 12: | Line 12: | ||
== Finding Datasets == | == Finding Datasets == | ||
'' [https://datasetsearch.research.google.com/ Google Data Set Search] | ''[[https://datasetsearch.research.google.com/ Google Data Set Search]] | ||
'' [https://scholar.google.com/schhp?hl=en Google Scholar]Google Search Power for Academic Writing<br /> | ''[[https://scholar.google.com/schhp?hl=en Google Scholar]] Google Search Power for Academic Writing<br /> | ||
'' [https://www.jstor.org/ JSTOR] digital library of academic journals, books, and primary sources | ''[[https://www.jstor.org/ JSTOR]] digital library of academic journals, books, and primary sources | ||
'' [https://www.researchgate.net/ Research Gate] Massive Database of Academic Journals | ''[[https://www.researchgate.net/ Research Gate]] Massive Database of Academic Journals | ||
'' [https://www.google.com/search?q=site%3A''.edu+%22free%22+%28%22research%22+or+%22dataset%22 Google Dork] for Academic Research Resources | ''[[https://www.google.com/search?q=site%3A''.edu+%22free%22+%28%22research%22+or+%22dataset%22 Google Dork]] for Academic Research Resources | ||
'' [https://elicit.org Elicit] AI journal search. | ''[[https://elicit.org Elicit]] AI journal search. | ||
'' [https://osf.io/ Open Science Framework (OSF)] is a free and open-source platform designed to support research and collaboration across the research life cycle. | ''[[https://osf.io/ Open Science Framework (OSF)]] is a free and open-source platform designed to support research and collaboration across the research life cycle. | ||
'' [https://github.com/awesomedata/awesome-public-datasets Awesome Public Datasets] A curated list of over 40,000 public datasets across various topics. | ''[[https://github.com/awesomedata/awesome-public-datasets Awesome Public Datasets]] A curated list of over 40,000 public datasets across various topics. | ||
'' [https://github.com/hslatman/awesome-threat-intelligence Awesome Cyber Threat Datasets] provide context, mechanisms, indicators, implications, and actionable advice about an existing or emerging menace or hazard to assets that can inform decisions regarding the subject’s response to that menace or hazard_. | ''[[https://github.com/hslatman/awesome-threat-intelligence Awesome Cyber Threat Datasets]] provide context, mechanisms, indicators, implications, and actionable advice about an existing or emerging menace or hazard to assets that can inform decisions regarding the subject’s response to that menace or hazard_. | ||
'' [https://www.reddit.com/r/datasets/ r/datasets] | ''[[https://www.reddit.com/r/datasets/ r/datasets]] | ||
'' [https://www.reddit.com/r/Dissertation/ r/Dissertation] | ''[[https://www.reddit.com/r/Dissertation/ r/Dissertation]] | ||
'' [https://www.reddit.com/r/AskAcademia/ r/AskAcademia] | ''[[https://www.reddit.com/r/AskAcademia/ r/AskAcademia]] | ||
'' [https://www.reddit.com/r/GradSchool/ r/GradSchool] | ''[[https://www.reddit.com/r/GradSchool/ r/GradSchool]] | ||
<span id="queries-for-datasets"></span> | <span id="queries-for-datasets"></span> | ||
Line 31: | Line 31: | ||
<pre class="copy-search">"Search_TERM_HERE" site:vision.in.tum.de OR site:www.cdbb.cam.ac.uk OR site:bimportal.scottishfuturestrust.org.uk OR site:digicatapult.org.uk OR site:pewresearch.org OR site:odsc.com OR site:archive.ics.uci.edu OR site:research.tudelft.nl OR site:archive.data.jhu.edu OR site:systems.jhu.edu</pre> | <pre class="copy-search">"Search_TERM_HERE" site:vision.in.tum.de OR site:www.cdbb.cam.ac.uk OR site:bimportal.scottishfuturestrust.org.uk OR site:digicatapult.org.uk OR site:pewresearch.org OR site:odsc.com OR site:archive.ics.uci.edu OR site:research.tudelft.nl OR site:archive.data.jhu.edu OR site:systems.jhu.edu</pre> | ||
<span id="case-studies-and-projects-using-datasets"></span> | <span id="case-studies-and-projects-using-datasets"></span> | ||
=== Case Studies and Projects Using Datasets === | === Case Studies and Projects Using Datasets === | ||
'' [https://github.com/vizdata-f21/project-2-tidy_team?tab=readme-ov-file 2 Tidy] interactive spatio-temporal visualization of worldwide deaths related to various risk factors, specifically air pollution, substance use, and lack of sanitation. | ''[[https://github.com/vizdata-f21/project-2-tidy_team?tab=readme-ov-file 2 Tidy]]'' interactive spatio-temporal visualization of worldwide deaths related to various risk factors, specifically air pollution, substance use, and lack of sanitation. | ||
== Known Datasets == | |||
{| class="wikitable" | {| class="wikitable" | ||
|- | |- | ||
! style="text-align: left;"| URL | ! style="text-align: left;"| URL | ||
! | ! Comments | ||
! style="text-align: right;"| Free (Y/N) | ! style="text-align: right;"| Free (Y/N) | ||
! Category | ! Category | ||
! Region | ! Region | ||
|- | |- | ||
| style="text-align: left;"| [https://library.si.edu/research/free-databases-and-collections Smithsonian Library Resources] | | style="text-align: left;"| [[https://library.si.edu/research/free-databases-and-collections Smithsonian Library Resources]] | ||
| This list includes databases, collections and search tools, selected by Smithsonian Libraries staff, that are freely available via the Internet. | | This list includes databases, collections and search tools, selected by Smithsonian Libraries staff, that are freely available via the Internet. | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| Academic | | Academic | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [http://cross-sub.org/ CrossSub]<br> [https://yz-data.shinyapps.io/xsub/ Alt Link] | | style="text-align: left;"| [[http://cross-sub.org/ CrossSub]]<br>[[https://yz-data.shinyapps.io/xsub/ Alt Link]] | ||
| micro-level, subnational event data on armed conflict and contention around the | | micro-level, subnational event data on armed conflict and contention around the world | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| Conflict | | Conflict | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://acleddata.com/#/dashboard ACLE] | | style="text-align: left;"| [[https://acleddata.com/#/dashboard ACLE]] | ||
| real-time data on the locations, dates, actors, fatalities, and types of all reported political violence and protest events around the world. | | real-time data on the locations, dates, actors, fatalities, and types of all reported political violence and protest events around the world. | ||
| style="text-align: right;"| N | | style="text-align: right;"| N | ||
Line 62: | Line 65: | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://osmp.airwars.org OSMP] | | style="text-align: left;"| [[https://osmp.airwars.org OSMP]] | ||
| Open Source Munitions Portal (OSMP) A new open-source portal was just launched today by Airwars and Arms Research. Incredibly useful database, particularly for anyone covering armed conflicts or wars. | | Open Source Munitions Portal (OSMP) A new open-source portal was just launched today by Airwars and Arms Research. Incredibly useful database, particularly for anyone covering armed conflicts or wars. | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
Line 68: | Line 71: | ||
| N/A | | N/A | ||
|- | |- | ||
| style="text-align: left;"| [https://liveuamap.com/ LiveUA] | | style="text-align: left;"| [[https://liveuamap.com/ LiveUA]] | ||
| factual reporting of a variety of important topics including conflicts, human rights issues, protests, terrorism, weapons deployment, health matters, natural disasters, and weather related stories, among others, from a vast array of sources | | factual reporting of a variety of important topics including conflicts, human rights issues, protests, terrorism, weapons deployment, health matters, natural disasters, and weather related stories, among others, from a vast array of sources | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
Line 74: | Line 77: | ||
| Ukraine | | Ukraine | ||
|- | |- | ||
| style="text-align: left;"| [https://observatoriodeviolencia.org.ve/news/annual-report-violence-2023/ Venezuelan Violence Data] | | style="text-align: left;"| [[https://observatoriodeviolencia.org.ve/news/annual-report-violence-2023/ Venezuelan Violence Data]] | ||
| | | | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
Line 80: | Line 83: | ||
| LATAM - Mexico | | LATAM - Mexico | ||
|- | |- | ||
| style="text-align: left;"| [https://www.google.com/search?q=inurl%3Ahttps%3A%2F%2Fsimplemaps.com%2Fdata%2F' | | style="text-align: left;"| [[https://www.google.com/search?q=inurl%3Ahttps%3A%2F%2Fsimplemaps.com%2Fdata%2F' World City Database]] | ||
| Database of cities with information of population and general Lat | | Database of cities with information of population and general Lat Long | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| Country Data | | Country Data | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://tradingeconomics.com/indicators TradingEconomics] | | style="text-align: left;"| [[https://tradingeconomics.com/indicators TradingEconomics]] | ||
| Mass database of metrics and indicators by country over time | | Mass database of metrics and indicators by country over time | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
Line 92: | Line 95: | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://gitnux.org/topics/statistics/crime-and-safety-statistics/ GitNux Crime Reports] | | style="text-align: left;"| [[https://gitnux.org/topics/statistics/crime-and-safety-statistics/ GitNux Crime Reports]] | ||
| Crime reports and stats | | Crime reports and stats | ||
| style="text-align: right;"| | | style="text-align: right;"| | ||
Line 98: | Line 101: | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://radar.cloudflare.com/ Cloudflare Radar] | | style="text-align: left;"| [[https://radar.cloudflare.com/ Cloudflare Radar]] | ||
| A view of outages, threats, rankings and more based on the massive amount of cloudflare data | | A view of outages, threats, rankings and more based on the massive amount of cloudflare data | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
Line 104: | Line 107: | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://vision.in.tum.de/data TUM Data] | | style="text-align: left;"| [[https://vision.in.tum.de/data TUM Data]] | ||
| Large collection of data sets for computer vision research | | Large collection of data sets for computer vision research | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
Line 110: | Line 113: | ||
| | | | ||
|- | |- | ||
| style="text-align: left;"| [https://repositorio.cepal.org/server/api/core/bitstreams/2db8feef-29d6-4981-9741-9ad3154d3789/content CEPAL Cyber Attacks] | | style="text-align: left;"| [[https://repositorio.cepal.org/server/api/core/bitstreams/2db8feef-29d6-4981-9741-9ad3154d3789/content CEPAL Cyber Attacks]] | ||
| Cyber Attacks in LATAM | | Cyber Attacks in LATAM | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
Line 116: | Line 119: | ||
| LATAM | | LATAM | ||
|- | |- | ||
| style="text-align: left;"| [https://www.oecd.org/investment/statistics.htm OECD] | | style="text-align: left;"| [[https://www.oecd.org/investment/statistics.htm OECD]] | ||
| Foreign Direct Investment (FDI) Statistics | | Foreign Direct Investment (FDI) Statistics | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| Finance & | | Finance & Business | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://data.worldbank.org/ World Bank Data] | | style="text-align: left;"| [[https://data.worldbank.org/ World Bank Data]] | ||
| | | Economic Datasets | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| Finance & | | Finance & Business | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://bimportal.scottishfuturestrust.org.uk/page/roi-calculator Scottish Futures Trust ROI Calculator] | | style="text-align: left;"| [[https://bimportal.scottishfuturestrust.org.uk/page/roi-calculator Scottish Futures Trust ROI Calculator]] | ||
| Calculator that allows the user to calculate the expected return on investment of a building project | | Calculator that allows the user to calculate the expected return on investment of a building project | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| Finance & | | Finance & Business | ||
| | | | ||
|- | |- | ||
| style="text-align: left;"| [https://www.numbeo.com/cost-of-living/ Numbeo] | | style="text-align: left;"| [[https://www.numbeo.com/cost-of-living/ Numbeo]] | ||
| cost of living calculator and comparison tool. Useful for determining the average price around the world. | | cost of living calculator and comparison tool. Useful for determining the average price around the world. | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| Finance & | | Finance & Business | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://ai.reportlinker.com/pricing Reportlinker] | | style="text-align: left;"| [[https://ai.reportlinker.com/pricing Reportlinker]] | ||
| AI enabled Market Intelligence Platform | | AI enabled Market Intelligence Platform | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| Finance & | | Finance & Business | ||
| | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https:// | | style="text-align: left;"| [[https://www.kaggle.com/datasets Kaggle]] | ||
| | | Data repository with many datasets for competitions | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| | | Machine Learning | ||
| | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://www. | | style="text-align: left;"| [[https://www.youtube.com/ YouTube]] | ||
| | | Video Searchable Database of Machine Learning Videos | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| | | Machine Learning | ||
| | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://www. | | style="text-align: left;"| [[https://www.amazon.com/AWS/Amazon-S3 Amazon S3]] | ||
| | | Various datasets provided by Amazon Web Services | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| | | Data Storage | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https:// | | style="text-align: left;"| [[https://r-nd.ami.btu.de/ AMI Data Set]] | ||
| | | This dataset comprises data from a wide range of sources including the finance sector. | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| | | Data Storage | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https:// | | style="text-align: left;"| [[https://bostondata.org/ Boston Data]] | ||
| | | Boston city datasets | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| | | City Data | ||
| | | Boston | ||
|- | |- | ||
| style="text-align: left;"| [https:// | | style="text-align: left;"| [[https://www.socialeurope.eu/ Social Europe]] | ||
| Social Data | |||
| | |||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| | | Social | ||
| Global | | Global | ||
|- | |- | ||
| style="text-align: left;"| [https://www. | | style="text-align: left;"| [[https://www.census.gov/ US Census]] | ||
| | | Census Data | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| | | Demographics | ||
| USA | | USA | ||
|- | |- | ||
| style="text-align: left;"| [https:// | | style="text-align: left;"| [[https://data.gov/ US Government Data]] | ||
| | | Government Data | ||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| | | Government | ||
| USA | | USA | ||
|- | |- | ||
| style="text-align: left;"| [https:// | | style="text-align: left;"| [[https://www.data.gov.uk/ UK Government Data]] | ||
| Government Data | |||
| | |||
| style="text-align: right;"| Y | | style="text-align: right;"| Y | ||
| | | Government | ||
| | | UK | ||
|} | |} | ||
= Research Data Datasets = | |||
{{Special:FormEdit/ResearchData}} |
Revision as of 00:49, 12 September 2024
Datasets
Finding Datasets - Advanced Google Query for Datasets - Known Datasets
Datasets to use for Research (See Research wiki)
Find additional datasets behind a login in the community dataset section
Finding Datasets
[Google Data Set Search]
[Google Scholar] Google Search Power for Academic Writing
[JSTOR] digital library of academic journals, books, and primary sources
[Research Gate] Massive Database of Academic Journals
[Google Dork] for Academic Research Resources
[Elicit] AI journal search.
[Open Science Framework (OSF)] is a free and open-source platform designed to support research and collaboration across the research life cycle.
[Awesome Public Datasets] A curated list of over 40,000 public datasets across various topics.
[Awesome Cyber Threat Datasets] provide context, mechanisms, indicators, implications, and actionable advice about an existing or emerging menace or hazard to assets that can inform decisions regarding the subject’s response to that menace or hazard.
[r/datasets]
[r/Dissertation]
[r/AskAcademia]
[r/GradSchool]
Queries for Datasets
"Search_TERM_HERE" site:vision.in.tum.de OR site:www.cdbb.cam.ac.uk OR site:bimportal.scottishfuturestrust.org.uk OR site:digicatapult.org.uk OR site:pewresearch.org OR site:odsc.com OR site:archive.ics.uci.edu OR site:research.tudelft.nl OR site:archive.data.jhu.edu OR site:systems.jhu.edu
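The query above follows a simple pattern: a quoted search term followed by `site:` filters joined with `OR`. A minimal Python sketch of that pattern is below. The function names `build_dataset_query` and `google_search_url` are hypothetical helpers introduced here for illustration, and the domain list is copied from the query above; adjust it to your own sources.

```python
from urllib.parse import quote_plus

# Domains copied from the query above; edit this list to suit your topic.
SITES = [
    "vision.in.tum.de",
    "www.cdbb.cam.ac.uk",
    "bimportal.scottishfuturestrust.org.uk",
    "digicatapult.org.uk",
    "pewresearch.org",
    "odsc.com",
    "archive.ics.uci.edu",
    "research.tudelft.nl",
    "archive.data.jhu.edu",
    "systems.jhu.edu",
]


def build_dataset_query(term, sites=SITES):
    """Return the quoted term followed by site: filters joined with OR."""
    site_filter = " OR ".join(f"site:{site}" for site in sites)
    return f'"{term}" {site_filter}'


def google_search_url(query):
    """URL-encode the query so it can be pasted into a browser as a link."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    query = build_dataset_query("air quality")
    print(query)
    print(google_search_url(query))
```

Replacing `"air quality"` with your own term gives a query string you can paste into Google, or a ready-made search link.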
Case Studies and Projects Using Datasets
[2 Tidy] interactive spatio-temporal visualization of worldwide deaths related to various risk factors, specifically air pollution, substance use, and lack of sanitation.
Known Datasets
URL | Comments | Free (Y/N) | Category | Region |
---|---|---|---|---|
[Smithsonian Library Resources] | This list includes databases, collections and search tools, selected by Smithsonian Libraries staff, that are freely available via the Internet. | Y | Academic | Global |
[CrossSub] [Alt Link] | micro-level, subnational event data on armed conflict and contention around the world | Y | Conflict | Global |
[ACLED] | Real-time data on the locations, dates, actors, fatalities, and types of all reported political violence and protest events around the world. | N | Conflict | Global |
[OSMP] | Open Source Munitions Portal (OSMP), an open-source portal launched by Airwars and Arms Research. A useful database, particularly for anyone covering armed conflicts or wars. | Y | Conflict | N/A |
[LiveUA] | Factual reporting on a variety of important topics including conflicts, human rights issues, protests, terrorism, weapons deployment, health matters, natural disasters, and weather-related stories, among others, from a vast array of sources | Y | Conflict | Ukraine |
[Venezuelan Violence Data] | | Y | Conflict | LATAM - Mexico |
[World City Database] | Database of cities with population figures and approximate latitude and longitude | Y | Country Data | Global |
[TradingEconomics] | Large database of metrics and indicators by country over time | Y | Country Data | Global |
[GitNux Crime Reports] | Crime reports and stats | | Crime | Global |
[Cloudflare Radar] | A view of outages, threats, rankings, and more, based on Cloudflare's network data | Y | Cyber | Global |
[TUM Data] | Large collection of data sets for computer vision research | Y | Cyber | |
[CEPAL Cyber Attacks] | Cyber Attacks in LATAM | Y | Cyber | LATAM |
[OECD] | Foreign Direct Investment (FDI) Statistics | Y | Finance & Business | Global |
[World Bank Data] | Economic Datasets | Y | Finance & Business | Global |
[Scottish Futures Trust ROI Calculator] | Calculator that allows the user to calculate the expected return on investment of a building project | Y | Finance & Business | |
[Numbeo] | Cost-of-living calculator and comparison tool. Useful for comparing average prices around the world. | Y | Finance & Business | Global |
[Reportlinker] | AI-enabled market intelligence platform | Y | Finance & Business | Global |
[Kaggle] | Data repository with many datasets for competitions | Y | Machine Learning | Global |
[YouTube] | Searchable video database, including machine learning videos | Y | Machine Learning | Global |
[Amazon S3] | Various datasets provided by Amazon Web Services | Y | Data Storage | Global |
[AMI Data Set] | This dataset comprises data from a wide range of sources including the finance sector. | Y | Data Storage | Global |
[Boston Data] | Boston city datasets | Y | City Data | Boston |
[Social Europe] | Social Data | Y | Social | Global |
[US Census] | Census Data | Y | Demographics | USA |
[US Government Data] | Government Data | Y | Government | USA |
[UK Government Data] | Government Data | Y | Government | UK |