The nature of scientific datasets in South American repositories: a survey of formats and extensions

Author	Marcello Mundim Rodrigues, Cíntia de Azevedo Lourenço, Guilherme Ataíde Dias
Position	Doctoral candidate in Knowledge Management and Organization at the Universidade Federal de Minas Gerais / PhD in Information Science from the Universidade Federal de Minas Gerais, Associate Professor, Universidade Federal de Minas Gerais, Escola de Ciência da Informação, Belo Horizonte, Brazil / Postdoctorate in Information Science from UNESP, Professor...
Pages	204-229

Encontros Bibli: revista eletrônica de biblioteconomia e ciência da informação, Florianópolis, v. 27, p. 01-26, 2022.

Universidade Federal de Santa Catarina. ISSN 1518-2924. DOI: https://doi.org/10.5007/1518-2924.2022.e85148

Original Article

THE NATURE OF SCIENTIFIC DATASETS IN SOUTH AMERICAN REPOSITORIES: A SURVEY OF FORMATS AND EXTENSIONS

Marcello Mundim Rodrigues

Doctoral candidate in Knowledge Management and Organization at the Universidade Federal de Minas Gerais

marcellomundim@ufu.br

https://orcid.org/0000-0001-7945-6673

Cíntia de Azevedo Lourenço

PhD in Information Science from the Universidade Federal de Minas Gerais

Associate Professor

Universidade Federal de Minas Gerais, Escola de Ciência da Informação, Belo Horizonte, Brazil

cintia.eci.ufmg@gmail.com

https://orcid.org/0000-0002-2172-7300

Guilherme Ataíde Dias

Postdoctorate in Information Science from UNESP

Associate Professor III

Universidade Federal da Paraíba, Departamento de Ciência da Informação, João Pessoa, PB, Brazil

guilhermeataide@gmail.com

https://orcid.org/0000-0001-6576-0017

The complete list of author information is at the end of the article

ABSTRACT

Objective: to identify the scientific data repositories created and managed by South American Higher Education Institutions and/or research and funding agencies; to identify and describe the formats and extensions of the files that make up the scientific datasets deposited in these repositories.

Methods: eight repositories retrieved through RE3DATA were selected for investigation. A population (N) of 1,115 scientific datasets was obtained. Using Stratified Random Sampling, a sample (n) of 258 datasets was reached, which corresponds to 23.15% of the population (N). The data drawn from the samples were condensed into tables and charts.

Results: the nature of the scientific datasets investigated is concentrated in textual and numerical data, saved in text files and tables, respectively. The datasets may be either homogeneous (one or more files saved in a single format and extension, e.g., image format in .jpg) or heterogeneous (files saved in different formats and extensions, e.g., the same image format saved in .jpg and .tiff) in their composition. It was also found that some extensions enable identification of the nature, domain, and content of the data, as observed in the .gpx and .gdb extensions, which refer to geolocation data and are therefore alphanumeric in nature.

Conclusions: there is a growing need to describe the nature of data, as well as the formats and extensions of their files. This kind of descriptive metadata would be valuable to potential users, as it would allow a greater understanding of the context of the data, with a focus on reuse.

Keywords: scientific data; datasets; data repositories; formats and extensions; survey.
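
The distinction between homogeneous and heterogeneous datasets described in the Results can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes that a dataset is represented simply as a list of file names and that any dataset whose files carry more than one distinct extension counts as heterogeneous. The function names and the sample file names are hypothetical.

```python
from pathlib import PurePosixPath


def file_extensions(filenames):
    """Return the set of lowercase extensions found in a list of file names."""
    return {PurePosixPath(name).suffix.lower() for name in filenames}


def classify_dataset(filenames):
    """Label a dataset as homogeneous or heterogeneous.

    Assumption (not from the article's method): a dataset is heterogeneous
    as soon as its files carry more than one distinct extension; otherwise
    it is homogeneous.
    """
    return "heterogeneous" if len(file_extensions(filenames)) > 1 else "homogeneous"


# Hypothetical file lists, invented for illustration only.
photos = ["plot_01.jpg", "plot_02.JPG"]
scans = ["scan_01.jpg", "scan_02.tiff"]

print(classify_dataset(photos))
print(classify_dataset(scans))
```

Comparing extensions case-insensitively is a design choice of this sketch, so that .jpg and .JPG are treated as the same extension. A fuller treatment would also need a mapping from extensions to formats, which the abstract does not provide.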

1 INTRODUCTION

Over the course of the evolution of science, certain periods in human history have stood out for the way scientific practice was conducted. At first, experiments and observations were made on the natural behavior of things in the physical world that were amenable to analysis, in what came to be called empirical science. Empiricism works with distinct variables, in controlled or uncontrolled environments, and seeks to validate or refute correlations, such as cause and effect.

As a consequence of empirical experience, a paradigm then emerged that relied on observation and experimentation with the aim of testing hypotheses. Hypotheses, theories, and laws were thus the fruit of this scientific process, which endured for centuries until it began to change in the mid-twentieth century.

Since then, humanity has made use of computing to generate simulations and data analyses in its experiments, observations, and hypothesis tests, relying at first on bulky computers with low storage and processing capacity. Over the decades, the processing capacity of these machines improved so that they could assist researchers more effectively in their daily work, reducing the time spent on the analysis of collected data and improving its quality.

From the mid-1940s onward, with the end of the Second World War and the beginning of the Cold War, the world saw the start of an arms and technology race, along with disputes over influence and territory between the United States and the former Soviet Union, which affected the pace of computational and industrial development.

Other advances in Science and Technology (S&T) also emerged in the post-Second World War period, such as research into the use of nuclear energy, the third industrial revolution, and Information and Communication Technologies (ICTs), among them the Web and the Internet, that is, strategic intelligence services in a power struggle between great powers.
