Quickly import data
If the data structure is simple, you can skip the data modeling step described in the structured import. Instead, the Quick data import derives the data structure automatically (with some limitations). It makes the following assumptions about file types and file contents:
- The supported file types are CSV, XLS, XLSX, and a ZIP file containing CSV files
- A file consists of at least two lines: the first is the header, the rest is data
- The first line is always expected to be the header
- Empty files cannot be imported
- Files containing only a header cannot be uploaded
- You need a package you can write content into, usually the package of the group you are a member of
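The file assumptions above can be checked locally before uploading. The following is a minimal sketch in Python, not part of MOLGENIS; the function names `check_file_name` and `check_csv_text` are hypothetical and only mirror the rules listed here.

```python
import csv
import io
from pathlib import Path

# File types listed as supported by the quick data import.
SUPPORTED_EXTENSIONS = {".csv", ".xls", ".xlsx", ".zip"}


def check_file_name(name: str) -> list:
    """Return a list of problems with the file type (empty list means OK)."""
    ext = Path(name).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return [f"unsupported file type: {ext or '(none)'}"]
    return []


def check_csv_text(text: str) -> list:
    """Return a list of problems with CSV content (empty list means OK)."""
    # Skip blank lines so trailing newlines do not count as rows.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return ["file is empty"]
    if len(rows) < 2:
        return ["file contains only a header"]
    return []
```

For example, `check_csv_text("id,name\n")` reports a header-only file, while `check_csv_text("id,name\n1,Alice\n")` returns an empty list.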
Metadata guessing
The Quick data import tries to guess the type of your data. MOLGENIS supports multiple data types, such as booleans, dates, and numbers.
- To get INTEGERS, use values like 1, 30, or 2012
- To get LONG, use values like 128938972487837384
- To get DECIMAL, use values like 2.3, 2901.3, or 123.8
- To get DATE, submit strings in the format of dd/mm/yyyy
- To get BOOLEAN, submit strings in the format TRUE or FALSE
- Other values are inserted into the database as STRING. If a piece of text is longer than 255 characters, we use TEXT
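To make the rules above concrete, here is a rough sketch in Python of how a single value could be classified. It is not the MOLGENIS implementation: the function name `guess_type`, the 32-bit and 64-bit boundaries for INTEGER and LONG, and the exact patterns are assumptions made for illustration.

```python
import re
from datetime import datetime

# Assumed ranges: 32-bit for INTEGER, 64-bit for LONG.
INT_MIN, INT_MAX = -2**31, 2**31 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1
MAX_STRING_LENGTH = 255  # longer text becomes TEXT


def guess_type(value: str) -> str:
    """Guess a type label for a single value (illustrative only)."""
    text = value.strip()
    if text in ("TRUE", "FALSE"):
        return "BOOLEAN"
    if re.fullmatch(r"[+-]?[0-9]+", text):
        number = int(text)
        if INT_MIN <= number <= INT_MAX:
            return "INTEGER"
        if LONG_MIN <= number <= LONG_MAX:
            return "LONG"
    elif re.fullmatch(r"[+-]?[0-9]+\.[0-9]+", text):
        return "DECIMAL"
    elif re.fullmatch(r"[0-9]{2}/[0-9]{2}/[0-9]{4}", text):
        try:
            datetime.strptime(text, "%d/%m/%Y")
            return "DATE"
        except ValueError:
            pass  # e.g. 31/02/2012 matches the pattern but is not a real date
    # Everything else is text; the length decides between STRING and TEXT.
    return "TEXT" if len(text) > MAX_STRING_LENGTH else "STRING"
```

Under these assumptions, `guess_type("128938972487837384")` yields `LONG` because the value does not fit in 32 bits, and a value that only looks like a date, such as `31/02/2012`, falls back to `STRING`.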
Data structure
MOLGENIS has an explicit data structure within the application. You need to know the basic terminology to understand how the data is structured.
Terminology
In this section we introduce and explain the terminology MOLGENIS uses for its data structure.
- Group: A group of people who manage data within one package (folder).
- Package: Each group has a root package where it can store its data. Packages can have child packages to logically subdivide that root package into a tree structure, like folders on a hard drive.
- Entity Type: An entity type is the metadata of a data collection, like a table in a database.
- Entity: The actual data that is collected based on the template from an entity type, like a table row in a database.
- Attribute: An attribute describes the characteristics of a data item in an entity type, like a column in a database.
Data is imported into the MOLGENIS database as a single entity (table). Entities are grouped within packages (folders). Your entities are stored within a package that you can write content into, usually your "group" package.
The base folder in which all other entities and packages are placed depends on the group you are part of.
- In the case of an Excel file, the file name is used as the package name and the name of each workbook sheet is used as an entity name. The package will be created as a child of the first writable package, usually the package of the group you are in.
- In the case of a CSV file, the file name is used as both the package name and the entity name. The package will be created as a child of the first writable package, usually the package of the group you are in.
- In the case of a ZIP file, the name of the ZIP file is used as the package name, and the names of the files inside the ZIP are used as the names of the entities. The package will be created as a child of the first writable package, usually the package of the group you are in.
You can move the package to the location you want after the import is done.
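The naming rules for the three file types can be summarized in a short Python sketch. The helper `derive_names` is hypothetical, and stripping the file extension from the names is an assumption; the text above only says that the file name is used.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def derive_names(file_name: str,
                 members: Optional[List[str]] = None) -> Tuple[str, List[str]]:
    """Return (package_name, entity_names) for an uploaded file.

    members: sheet names for an Excel file, or the names of the files
    inside a ZIP file. It is ignored for a single CSV file.
    """
    path = Path(file_name)
    package = path.stem  # assumption: the extension is dropped
    ext = path.suffix.lower()
    if ext == ".csv":
        # One file gives one package containing one entity of the same name.
        return package, [package]
    if ext in (".xls", ".xlsx"):
        # Each workbook sheet becomes an entity.
        return package, list(members or [])
    if ext == ".zip":
        # Each file inside the ZIP becomes an entity.
        return package, [Path(m).stem for m in members or []]
    raise ValueError(f"unsupported file type: {ext}")
```

For instance, `derive_names("study.xlsx", ["samples", "subjects"])` gives the package `study` with the entities `samples` and `subjects`.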
How to use
- Click upload file
- Select a file
- Wait...
- Done!
After the import is done, there are two types of links:
- The file name takes you to the MOLGENIS navigator. Here you can view all packages and the corresponding data tables.
- The nested links (sheet names for Excel, file names for CSV) take you to the data explorer. Here you can view, filter, query, download, and share your data.