Research Datasets: Difference between revisions

From Irregularpedia
(9 intermediate revisions by 2 users not shown)
<span id="datasets"></span>
= Datasets =
''[[#Finding Datasets|Finding Datasets]] - [[#Queries for Datasets|Queries for Datasets]] - [[#Known Datasets|Known Datasets]]''
Datasets to use for research (see the [[research|Research wiki]]).

Additional datasets are available behind a login in the [[datasets|community dataset]] section.


<span id="finding-datasets"></span>
== Finding Datasets ==


* [https://datasetsearch.research.google.com/ Google Dataset Search]
* [https://scholar.google.com/schhp?hl=en Google Scholar] Google-powered search for academic writing.
* [https://www.jstor.org/ JSTOR] Digital library of academic journals, books, and primary sources.
* [https://www.researchgate.net/ ResearchGate] Large database of academic publications.
* [https://www.google.com/search?q=site%3A.edu+%22free%22+%28%22research%22+or+%22dataset%22 Google Dork] for academic research resources.
* [https://elicit.org Elicit] AI journal search.
* [https://osf.io/ Open Science Framework (OSF)] Free, open-source platform that supports research and collaboration across the research life cycle.
* [https://github.com/awesomedata/awesome-public-datasets Awesome Public Datasets] Curated list of public datasets across various topics.
* [https://github.com/hslatman/awesome-threat-intelligence Awesome Cyber Threat Datasets] Threat intelligence resources that provide context, mechanisms, indicators, implications, and actionable advice about existing or emerging threats to assets, and that can inform decisions about how to respond to those threats.
* [https://forum.irregularchat.com/tag/dataset DataSet Tag] on the IrregularChat Forum
* [https://www.reddit.com/r/datasets/ r/datasets]
* [https://www.reddit.com/r/Dissertation/ r/Dissertation]
* [https://www.reddit.com/r/AskAcademia/ r/AskAcademia]
* [https://www.reddit.com/r/GradSchool/ r/GradSchool]


<span id="queries-for-datasets"></span>
=== Queries for Datasets ===


<pre class="copy-search">"Search_TERM_HERE" site:vision.in.tum.de OR site:www.cdbb.cam.ac.uk OR site:bimportal.scottishfuturestrust.org.uk OR site:digicatapult.org.uk OR site:pewresearch.org OR site:odsc.com OR site:archive.ics.uci.edu OR site:research.tudelft.nl OR site:archive.data.jhu.edu OR site:systems.jhu.edu</pre>
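
The query above can also be assembled programmatically, which makes it easier to swap the search term or extend the site list. The following is a minimal sketch, not part of the original page: the function names <code>build_dataset_query</code> and <code>google_search_url</code> are hypothetical, the domain list is copied from the query above, and the URL form assumes the standard <code>https://www.google.com/search?q=</code> pattern already used by the Google Dork link on this page.

```python
from urllib.parse import quote_plus

# Domains copied from the site-restricted query above.
DATASET_SITES = [
    "vision.in.tum.de",
    "www.cdbb.cam.ac.uk",
    "bimportal.scottishfuturestrust.org.uk",
    "digicatapult.org.uk",
    "pewresearch.org",
    "odsc.com",
    "archive.ics.uci.edu",
    "research.tudelft.nl",
    "archive.data.jhu.edu",
    "systems.jhu.edu",
]


def build_dataset_query(term, sites=DATASET_SITES):
    """Return the quoted search term followed by site: filters joined with OR.

    Hypothetical helper for illustration; the operator must be an
    uppercase OR for the search engine to treat it as a boolean.
    """
    term = term.strip()
    if not term:
        raise ValueError("term must not be empty")
    site_filter = " OR ".join(f"site:{site}" for site in sites)
    return f'"{term}" {site_filter}'


def google_search_url(query):
    """URL-encode the query for the q= parameter of a Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    query = build_dataset_query("urban air quality")
    print(query)
    print(google_search_url(query))
```

Passing a different <code>sites</code> list restricts the search to other repositories without retyping the whole query.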


<span id="case-studies-and-projects-using-datasets"></span>
=== Case Studies and Projects Using Datasets ===


* [https://github.com/vizdata-f21/project-2-tidy_team?tab=readme-ov-file Global Deaths by Risk Factors (GitHub repo)] Interactive spatio-temporal visualization of worldwide deaths related to various risk factors, specifically air pollution, substance use, and lack of sanitation.


== Known Datasets ==


{| class="wikitable sortable"
|-
! style="text-align: left;"| URL
! Comments
! Free (Y/N)
! Category
! Region
|-
| style="text-align: left;"| [https://library.si.edu/research/free-databases-and-collections Smithsonian Library Resources]
| Databases, collections, and search tools selected by Smithsonian Libraries staff that are freely available via the Internet.
| style="text-align: right;"| Y
| Academic
| Global
|-
| style="text-align: left;"| [http://cross-sub.org/ CrossSub]<br>[https://yz-data.shinyapps.io/xsub/ Alt Link]
| Micro-level, subnational event data on armed conflict and contention around the world.
| style="text-align: right;"| Y
| Conflict
| Global
|-
| style="text-align: left;"| [https://acleddata.com/#/dashboard ACLED]
| Real-time data on the locations, dates, actors, fatalities, and types of all reported political violence and protest events around the world.
| style="text-align: right;"| N
| Conflict
| Global
|-
| style="text-align: left;"| [https://osmp.airwars.org OSMP]
| Open Source Munitions Portal (OSMP), an open-source portal launched by Airwars and Arms Research. Useful database, particularly for anyone covering armed conflicts or wars.
| style="text-align: right;"| Y
| Conflict
| N/A
|-
| style="text-align: left;"| [https://liveuamap.com/ LiveUA]
| Factual reporting on a variety of topics, including conflicts, human rights issues, protests, terrorism, weapons deployment, health matters, natural disasters, and weather-related stories, drawn from a wide array of sources.
| style="text-align: right;"| Y
| Conflict
| Ukraine
|-
| style="text-align: left;"| [https://observatoriodeviolencia.org.ve/news/annual-report-violence-2023/ Venezuelan Violence Data]
| 2023 annual report on violence in Venezuela.
| style="text-align: right;"| Y
| Conflict
| LATAM - Venezuela
|-
| style="text-align: left;"| [https://www.google.com/search?q=inurl%3Ahttps%3A%2F%2Fsimplemaps.com%2Fdata%2F' World City Database]
| Database of cities with population and approximate latitude/longitude.
| style="text-align: right;"| Y
| Country Data
| Global
|-
| style="text-align: left;"| [https://tradingeconomics.com/indicators TradingEconomics]
| Large database of metrics and indicators by country over time.
| style="text-align: right;"| Y
| Country Data
| Global
|-
| style="text-align: left;"| [https://gitnux.org/topics/statistics/crime-and-safety-statistics/ GitNux Crime Reports]
| Crime reports and statistics.
| style="text-align: right;"|
| Crime
| Global
|-
| style="text-align: left;"| [https://radar.cloudflare.com/ Cloudflare Radar]
| A view of outages, threats, rankings, and more, based on Cloudflare's network data.
| style="text-align: right;"| Y
| Cyber
| Global
|-
| style="text-align: left;"| [https://vision.in.tum.de/data TUM Data]
| Large collection of datasets for computer vision research.
| style="text-align: right;"| Y
| Cyber
|
|-
| style="text-align: left;"| [https://repositorio.cepal.org/server/api/core/bitstreams/2db8feef-29d6-4981-9741-9ad3154d3789/content CEPAL Cyber Attacks]
| Cyber attacks in LATAM.
| style="text-align: right;"| Y
| Cyber
| LATAM
|-
| style="text-align: left;"| [https://www.oecd.org/investment/statistics.htm OECD]
| Foreign Direct Investment (FDI) statistics.
| style="text-align: right;"| Y
| Finance & Business
| Global
|-
| style="text-align: left;"| [https://data.worldbank.org/ World Bank Data]
| Economic datasets.
| style="text-align: right;"| Y
| Finance & Business
| Global
|-
| style="text-align: left;"| [https://bimportal.scottishfuturestrust.org.uk/page/roi-calculator Scottish Futures Trust ROI Calculator]
| Calculator for the expected return on investment of a building project.
| style="text-align: right;"| Y
| Finance & Business
|
|-
| style="text-align: left;"| [https://www.numbeo.com/cost-of-living/ Numbeo]
| Cost-of-living calculator and comparison tool. Useful for determining average prices around the world.
| style="text-align: right;"| Y
| Finance & Business
| Global
|-
| style="text-align: left;"| [https://ai.reportlinker.com/pricing Reportlinker]
| AI-enabled market intelligence platform.
| style="text-align: right;"| Y
| Finance & Business
| Global
|-
| style="text-align: left;"| [https://www.kaggle.com/datasets Kaggle]
| Data repository with many datasets for competitions.
| style="text-align: right;"| Y
| Machine Learning
| Global
|-
| style="text-align: left;"| [https://www.youtube.com/ YouTube]
| Searchable database of machine learning videos.
| style="text-align: right;"| Y
| Machine Learning
| Global
|-
| style="text-align: left;"| [https://registry.opendata.aws AWS Open Datasets]
| Various datasets provided by Amazon Web Services.
| style="text-align: right;"| Y
| Data Storage
| Global
|-
| style="text-align: left;"| [https://r-nd.ami.btu.de/ AMI Data Set]
| Data from a wide range of sources, including the finance sector.
| style="text-align: right;"| Y
| Data Storage
| Global
|-
| style="text-align: left;"| [https://bostondata.org/ Boston Data]
| Boston city datasets.
| style="text-align: right;"| Y
| City Data
| Boston
|-
| style="text-align: left;"| [https://www.socialeurope.eu/ Social Europe]
| Social data.
| style="text-align: right;"| Y
| Social
| Global
|-
| style="text-align: left;"| [https://www.census.gov/ US Census]
| Census data.
| style="text-align: right;"| Y
| Demographics
| USA
|-
| style="text-align: left;"| [https://data.gov/ US Government Data]
| Government data.
| style="text-align: right;"| Y
| Government
| USA
|-
| style="text-align: left;"| [https://www.data.gov.uk/ UK Government Data]
| Government data.
| style="text-align: right;"| Y
| Government
| UK
|-
| style="text-align: left;"| [https://newsdata.io/search-news NewsData.io]
| API and dashboard for retrieving news from any country in chronological order. The dashboard works without a login.
| style="text-align: right;"| Y
| News
| Global
|-
| style="text-align: left;"| '''[https://forum.irregularchat.com/t/list-of-chinese-firms-operating-in-the-us/966/1 Designation of Chinese Military Companies]'''
| Notice designating Chinese military companies operating in the United States, published 01/07/2025. The list is public and may be useful for general research.
| style="text-align: right;"| Y
| Military & Business
| China
|}



Latest revision as of 16:27, 9 January 2025

