JSONLines (.jsonl)
Background & Context
- Registered MIME types: application/jsonl
- JSON Lines data format.
- Consists of multiple JSON objects, one per line, representing individual data rows.
- Commonly used for storing structured JSON data.
- Text file format.
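As a small illustration of the layout described above, the following sketch builds a two-line JSON Lines string by hand and reads it back with ImportString. The sample string and the names jsonl and tab are made up for this example; the format name "JSONLines" is the one used in the examples on this page.

```wolfram
(* Hypothetical sample: two rows, one JSON object per line *)
jsonl = "{\"name\": \"a\", \"value\": 1}\n{\"name\": \"b\", \"value\": 2}";

(* Read the string as JSON Lines; a Tabular object is the documented default *)
tab = ImportString[jsonl, "JSONLines"];

(* Check the result type and inspect the rows as ordinary expressions *)
{TabularQ[tab], Normal[tab]}
```

Each line of the string is a complete JSON object, so a file in this format can be read or appended to one row at a time.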
Import & Export
- Import["file.jsonl"] imports a JSONLines file as a Tabular object.
- Import["file.jsonl",elem] imports the specified elements.
- Import["file.jsonl",{elem,subelem1,…}] imports the subelements subelem1, …, which is useful for partial data import.
- The import format can be specified with Import["file","JSONL"] or Import["file",{"JSONL",elem,…}].
- Export["file.jsonl",expr] creates a JSONLines file from expr.
- Supported expressions expr include:
  {v1,v2,…}                        a single column of data
  {{v11,v12,…},{v21,v22,…},…}      lists of rows of data
  array                            an array such as SparseArray, QuantityArray, etc.
  tseries                          a TimeSeries, EventSeries or a TemporalData object
  dataset                          a Dataset or a Tabular object
- See the following reference pages for full general information:
  Import, Export                       import from or export to a file
  CloudImport, CloudExport             import from or export to a cloud object
  ImportString, ExportString           import from or export to a string
  ImportByteArray, ExportByteArray     import from or export to a byte array
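The element and subelement forms above can be tried without touching the file system. The sketch below is only an illustration: the table contents and the names tab and str are hypothetical, and it assumes that ImportString accepts the same element specifications for this format as Import does for files.

```wolfram
(* Hypothetical three-row table built from a list of associations *)
tab = Tabular[{
    <|"id" -> 1, "label" -> "x"|>,
    <|"id" -> 2, "label" -> "y"|>,
    <|"id" -> 3, "label" -> "z"|>}];

(* Export to a JSON Lines string rather than a file *)
str = ExportString[tab, "JSONLines"];

(* Whole-file elements *)
ImportString[str, {"JSONLines", "RowCount"}]
ImportString[str, {"JSONLines", "ColumnLabels"}]

(* Partial import: rows 2 through 3, column "id" only *)
ImportString[str, {"JSONLines", "Tabular", 2 ;; 3, {"id"}}]
```

Replacing ExportString and ImportString with Export and Import, and the string with a file name, should give the file-based equivalent.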
Import Elements
- General Import elements:
  "Elements"       list of elements and options available in this file
  "Summary"        summary of the file
  "Rules"          list of rules for all available elements
- Data representation elements:
  "Data"           two-dimensional array
  "Dataset"        table data as a Dataset
  "EventSeries"    table data as an EventSeries
  "Tabular"        a Tabular object
  "TimeSeries"     table data as a TimeSeries
- Import by default uses the "Tabular" element.
- Subelements for partial data import for the "Tabular" element can take row and column specifications in the form {"Tabular",rows,cols}, where rows and cols can be any of the following:
  n                nth row or column
  -n               counts from the end
  n;;m             from n through m
  n;;m;;s          from n through m with steps of s
  {n1,n2,…}        specific rows or columns n1, n2, …
- Column specifications can also be any of the following:
  "col"            single column "col"
  {col1,col2,…}    list of column names col1, col2, …
- Data descriptor elements:
  "ColumnLabels"   names of columns
  "ColumnTypes"    association with data type for each column
  "Schema"         TabularSchema object
- Metadata elements:
  "ColumnCount"       number of columns stored in file
  "Dimensions"        data dimensions
  "RowCount"          number of rows stored in file
  "MetaInformation"   metadata
Import Options
- General Import options:
  "BlockSize"    1048576      how many bytes to process at a time
  "Schema"       Automatic    schema used to construct Tabular object
  "TimeColumn"   Automatic    column to use for times in "EventSeries" and "TimeSeries" elements
- Possible settings for the "Schema" option include:
  schema                  a complete TabularSchema specification
  prop->val               a schema property and value (see reference page for TabularSchema)
  <|"prop1"->val1,…|>     an association of schema properties and values
Export Options
- Export options include:
  "ExpressionFormattingFunction"   Automatic   how expressions stored in a Tabular object are converted to strings
- "ExpressionFormattingFunction" can be set to the following values:
  Automatic    default conversion to string
  form         any form supported by Format, such as InputForm
  f            arbitrary function that converts an expression to a string
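The three kinds of settings listed above can be compared side by side. This is a minimal sketch under stated assumptions: the table tab and its symbolic entries are hypothetical, the symbols x, a and b are assumed to have no values, and the use of TeXForm is just one possible choice of string-producing function.

```wolfram
(* Hypothetical table with one column of symbolic expressions *)
tab = Tabular[{<|"expr" -> Sin[x]^2|>, <|"expr" -> a + b|>}];

(* Automatic: default conversion to string *)
ExportString[tab, "JSONLines"]

(* A named form *)
ExportString[tab, "JSONLines", "ExpressionFormattingFunction" -> InputForm]

(* An arbitrary function that returns a string; here TeX markup *)
ExportString[tab, "JSONLines",
  "ExpressionFormattingFunction" -> (ToString[#, TeXForm] &)]
```

A custom function is mainly useful when the exported strings need to be read by another system that expects a particular notation.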
Examples
Basic Examples (3)
Import a Tabular object from a JSONLines file:
Import["ExampleData/world_cups.jsonl"]
Import["ExampleData/world_cups.jsonl", "Summary"]
Export a Tabular object to a JSONLines file:
tabular = Import["ExampleData/world_cups.jsonl"];
Export["file.jsonl", tabular]
Scope (3)
Import (3)
Show all elements available in the file:
Import["ExampleData/world_cups.jsonl", "Elements"]
By default, a Tabular object is returned:
Import["ExampleData/world_cups.jsonl"]//TabularQ
Import["ExampleData/world_cups.jsonl", "ColumnTypes"]
Import Elements (18)
"ColumnCount" (1)
"ColumnLabels" (1)
"ColumnTypes" (1)
"Data" (3)
Import["ExampleData/world_cups.jsonl", "Data"]
Import["ExampleData/world_cups.jsonl", {"Data", 1 ;; 2}]
Import["ExampleData/world_cups.jsonl", {"Data", All, {1, 2}}]
Import selected columns using column names:
Import["ExampleData/world_cups.jsonl", {"Data", All, {"Year", "Host Country"}}]
"Dataset" (3)
Get the data as a Dataset:
Import["ExampleData/world_cups.jsonl", "Dataset"]
Import["ExampleData/world_cups.jsonl", {"Dataset", 1 ;; 2}]
Import["ExampleData/world_cups.jsonl", {"Dataset", All, {1, 2}}]
Import selected columns using column names:
Import["ExampleData/world_cups.jsonl", {"Dataset", All, {"Year", "Host Country"}}]
"Dimensions" (1)
"EventSeries" (1)
Import a JSONLines file as an EventSeries:
Import["ExampleData/world_cups.jsonl", "EventSeries"]
Import a single row from a JSONLines file:
Import["ExampleData/world_cups.jsonl", {"EventSeries", 5}]
Import some specific rows from a JSONLines file:
Import["ExampleData/world_cups.jsonl", {"EventSeries", {1, 5, 7}}]
Import the first 10 rows of a JSONLines file:
Import["ExampleData/world_cups.jsonl", {"EventSeries", 1 ;; 10}]
Import only selected columns using column names:
Import["ExampleData/world_cups.jsonl", {"EventSeries", All, {"Year", "Host Country", "Winner"}}]
"RowCount" (1)
"Schema" (1)
Get the TabularSchema object:
Import["ExampleData/world_cups.jsonl", "Schema"]
"Summary" (1)
"Tabular" (3)
Get the data from a file as a Tabular object:
Import["ExampleData/world_cups.jsonl", "Tabular"]
Import["ExampleData/world_cups.jsonl", {"Tabular", 1 ;; 2}]
Import["ExampleData/world_cups.jsonl", {"Tabular", All, {1, 2}}]
Import selected columns using column names:
Import["ExampleData/world_cups.jsonl", {"Tabular", All, {"Year", "Host Country"}}]
"TimeSeries" (1)
Import a JSONLines file as a TimeSeries:
Import["ExampleData/world_cups.jsonl", "TimeSeries"]
Import a single row from a JSONLines file:
Import["ExampleData/world_cups.jsonl", {"TimeSeries", 5}]
Import some specific rows from a JSONLines file:
Import["ExampleData/world_cups.jsonl", {"TimeSeries", {1, 5, 7}}]
Import the first 10 rows of a JSONLines file:
Import["ExampleData/world_cups.jsonl", {"TimeSeries", 1 ;; 10}]
Import only selected columns using column names:
Import["ExampleData/world_cups.jsonl", {"TimeSeries", All, {"Year", "Host Country", "Winner"}}]
Import Options (3)
"BlockSize" (1)
Export["trees.jsonl", ResourceData["Sample Tabular Data: NYC Trees"]];
Import data using the default setting of "BlockSize"->2^20:
RepeatedTiming[Import["trees.jsonl"];]
Import data faster by specifying the value of the "BlockSize" option:
AssociationMap[First@RepeatedTiming[Import["trees.jsonl", "BlockSize" -> 2 ^ #];]&, {19, 21, 22}]
"Schema" (1)
file = Export["out.jsonl", Tabular[Association["RawSchema" -> Association["ColumnProperties" ->
Association["A" -> Association["ElementType" -> "String"],
"B" -> Association["ElementType" -> "String"]], "KeyColumns" -> None,
"Backend" -> "WolframKernel"], "BackendData" ->
Association["ColumnData" -> DataStructure["ColumnTable",
{{TabularColumn[Association["Data" -> {{0, {0, 11, 22, 33, 44, 55},
"Jan 03 2006Jan 04 2006Jan 05 2006Jan 06 2006Jan 09 2006"}, {}, None},
"ElementType" -> "String"]], TabularColumn[Association[
"Data" -> {{3, {0, 5, 10, 15, 20, 25}, "11.8212.0412.0911.8812.43"}, {}, None},
"ElementType" -> "String"]]}}]]]]];
By default, column labels and their types stored in a file are used when Tabular or Dataset objects are imported:
tabular = Import[file];
tabular["ColumnTypes"]
Use the "Schema" option to specify column labels and types:
tabular = Import[file, "Schema" -> {"ColumnKeys" -> {"Date", "Value"}, "ElementType" -> {"Date" -> "Date", "Value" -> "Real32"}}]
"TimeColumn" (1)
Export a Tabular object to a JSONLines file:
file = Export["file.jsonl", Tabular[Association["RawSchema" -> Association["ColumnProperties" ->
Association["Date" -> Association["ElementType" -> TypeSpecifier["Date"]["Integer32", "Day",
"Gregorian", None]], "Value" -> Association["ElementType" -> "Real32"]],
"KeyColumns" -> None, "Backend" -> "WolframKernel"], "Options" -> {},
"BackendData" -> Association["ColumnData" -> DataStructure["ColumnTable",
{{TabularColumn[Association["Data" -> {5, {{NumericArray[{13150, 13151, 13152, 13153, 13156},
"Integer32"], {}, None}}, None}, "ElementType" -> "Date"["Integer32", "Day",
"Gregorian", None]]], TabularColumn[Association[
"Data" -> {NumericArray[{11.819999694824219, 12.039999961853027, 12.09000015258789,
11.880000114440918, 12.430000305175781}, "Real32"], {}, None},
"ElementType" -> "Real32"]]}}]]]]];
By default, the time column is selected automatically for "TimeSeries" and "EventSeries" elements:
Import[file, "TimeSeries"]
Use the "TimeColumn" option to specify the time column:
Import[file, "TimeSeries", "TimeColumn" -> "Value"]
Export Options (1)
"ExpressionFormattingFunction" (1)
By default, Export uses a different string conversion depending on the column type:
tabular = Import["ExampleData/USstates.arrow", {"Tabular", 1 ;; 3}]
ColumnTypes[tabular]
ExportString[tabular, "JSONLines"]
Use "ExpressionFormattingFunction"->InputForm to get string versions of expressions suitable for input to the Wolfram Language:
ExportString[tabular, "JSONLines", "ExpressionFormattingFunction" -> InputForm]//Short[#, 5]&
Use an arbitrary function such as SpokenString:
ExportString[tabular, "JSONLines", "ExpressionFormattingFunction" -> SpokenString]
Use a function that specifies different rules for different expression types:
Clear[f]
f[x_Image] := ImageIdentify[x];
f[x_] := ToString[x, OutputForm]
ExportString[tabular, "JSONLines", "ExpressionFormattingFunction" -> f]
Applications (1)
Import a large JSONLines file from Kaggle:
data = DataConnectionObject[<|"ConnectionName" -> "Kaggle", "Location" -> "https://www.kaggle.com/datasets/devdope/900k-spotify?select=900k+Definitive+Spotify+Dataset.json"|>];
Import[data, "JSONLines"]
Dimensions[%]
See Also
History
Introduced in 2025 (14.3) | Updated in 2026 (15.0)