Getting Data from a Package
The examples in this section use the aleksey/hurdat demo package:
import quilt3
p = quilt3.Package.browse('aleksey/hurdat', 's3://quilt-example')
p

Loading manifest: 100%|██████████| 7/7 [00:00<00:00, 8393.40entries/s]

(remote Package)
 └─.gitignore
 └─.quiltignore
 └─notebooks/
   └─QuickStart.ipynb
 └─quilt_summarize.json
 └─requirements.txt
 └─scripts/
   └─build.py

Slicing through a package

Use dict key selection to slice into a package tree:
# returns PackageEntry("requirements.txt")
p["requirements.txt"]

PackageEntry('s3://quilt-example/aleksey/hurdat/requirements.txt?versionId=bQtxuZlaylNVHi0GmxkSMofT5qXJvP95')

# returns (remote Package)
p["notebooks"]

(remote Package)
 └─QuickStart.ipynb
Slicing into a Package directory returns another Package rooted at that subdirectory. Slicing into a package entry returns an individual PackageEntry.
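Because slicing is ordinary key lookup, a nested path can be followed one key at a time. The sketch below is a minimal illustration, not part of quilt3: `slice_path` is a hypothetical helper, and a plain nested dict stands in for the package tree so the example runs without network access.

```python
from functools import reduce


def slice_path(pkg, logical_path):
    """Follow a slash-separated path through successive key lookups.

    `pkg` can be any object that supports pkg[key], such as a nested dict.
    Hypothetical helper for illustration only.
    """
    parts = [part for part in logical_path.split("/") if part]
    return reduce(lambda node, key: node[key], parts, pkg)


# A nested dict standing in for a package tree (assumption for the demo).
tree = {
    "notebooks": {"QuickStart.ipynb": "entry"},
    "requirements.txt": "entry",
}

print(slice_path(tree, "notebooks/QuickStart.ipynb"))
```

With a real package, `slice_path(p, "notebooks/QuickStart.ipynb")` would be equivalent to writing `p["notebooks"]["QuickStart.ipynb"]` by hand.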

Downloading package data to disk

To download a package, one of its subdirectories, or a single entry to a destination on disk, use fetch:
# download a subfolder
p["notebooks"].fetch()

# download a single file
p["notebooks"]["QuickStart.ipynb"].fetch()

# download everything
p.fetch()

Copying objects: 100%|██████████| 36.7k/36.7k [00:01<00:00, 22.7kB/s]
100%|██████████| 36.7k/36.7k [00:01<00:00, 24.1kB/s]
Copying objects: 100%|██████████| 39.9k/39.9k [00:02<00:00, 16.5kB/s]

(local Package)
 └─.gitignore
 └─.quiltignore
 └─notebooks/
   └─QuickStart.ipynb
 └─quilt_summarize.json
 └─requirements.txt
 └─scripts/
   └─build.py
By default, fetch downloads files to the current directory. To download somewhere else, pass a destination path:
p["notebooks"]["QuickStart.ipynb"].fetch("./references/")

100%|██████████| 36.7k/36.7k [00:01<00:00, 22.5kB/s]

PackageEntry('file:///Users/gregezema/Documents/programs/quilt/docs/Walkthrough/references/')

Downloading package data into memory

Alternatively, you can download data directly into memory:
p["quilt_summarize.json"].deserialize()

['notebooks/QuickStart.ipynb']
To apply a custom deserializer to your data, pass the function as an argument to deserialize. For example, to load a hypothetical YAML file using yaml.safe_load:
import yaml
# returns the parsed contents (a list for this file)
p["quilt_summarize.json"].deserialize(yaml.safe_load)

['notebooks/QuickStart.ipynb']
The deserializer should accept a byte stream as input.
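A custom deserializer can be any callable that takes the raw contents and returns a Python object. The sketch below parses CSV into a list of row dicts. `csv_rows` is a hypothetical name, the sample bytes are made up for the demo, and the function accepts either a bytes object or a file-like byte stream, since the exact input type is an assumption here.

```python
import csv
import io


def csv_rows(data):
    """Parse UTF-8 CSV content into a list of row dicts.

    Accepts raw bytes or a file-like object with read() (assumption:
    the caller may hand over either form).
    """
    if hasattr(data, "read"):
        data = data.read()
    text = data.decode("utf-8")
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


# Made-up sample content standing in for a package entry's bytes.
sample = b"a,b\n1,2\n"
print(csv_rows(sample))
```

Assuming a package contained a CSV entry, the function would be passed the same way as yaml.safe_load above, for example `p["data.csv"].deserialize(csv_rows)`, where `data.csv` is a hypothetical entry name.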

Getting entry locations

You can get the path to a package entry or directory using get:
# returns the entry's location (here, a versioned S3 URI)
p["notebooks"]["QuickStart.ipynb"].get()

's3://quilt-example/aleksey/hurdat/notebooks/QuickStart.ipynb?versionId=PH.9gsCH6LM9RQIqsy1U4X6H6s.VoQ_B'
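For a remote package, the returned string packs the bucket, object key, and version ID into one URI. If those parts are needed separately, for instance to hand them to another S3 client, the standard library can split them. This is a minimal sketch using only urllib.parse; `split_s3_uri` is a hypothetical helper, not a quilt3 function.

```python
from urllib.parse import parse_qs, urlparse


def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, key, version_id).

    version_id is None when the URI has no versionId query parameter.
    Hypothetical helper for illustration only.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri!r}")
    version_id = parse_qs(parsed.query).get("versionId", [None])[0]
    return parsed.netloc, parsed.path.lstrip("/"), version_id


# The URI shown in the output above.
uri = (
    "s3://quilt-example/aleksey/hurdat/notebooks/QuickStart.ipynb"
    "?versionId=PH.9gsCH6LM9RQIqsy1U4X6H6s.VoQ_B"
)
bucket, key, version_id = split_s3_uri(uri)
print(bucket, key, version_id)
```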

Getting metadata

Metadata is available using the meta property.
# get entry metadata
p["notebooks"]["QuickStart.ipynb"].meta

# get directory metadata
p["notebooks"].meta

# get package metadata
p.meta
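To gather metadata for many entries at once, one option is to loop over the package and read meta from each entry. The sketch below assumes the package object offers a walk() method yielding (logical_key, entry) pairs and that an entry without metadata has an empty meta dict. `collect_meta` is a hypothetical helper, and simple stand-in objects replace a real package so the example runs offline.

```python
from types import SimpleNamespace


def collect_meta(pkg):
    """Map each logical key to its metadata, skipping entries with none.

    Assumes pkg.walk() yields (logical_key, entry) pairs and that each
    entry exposes a dict-like `meta` attribute.
    """
    return {key: entry.meta for key, entry in pkg.walk() if entry.meta}


# Stand-in objects mimicking the assumed interface (not real quilt3 types).
fake_entries = [
    ("a.txt", SimpleNamespace(meta={"k": 1})),
    ("b.txt", SimpleNamespace(meta={})),
]
fake_pkg = SimpleNamespace(walk=lambda: iter(fake_entries))

print(collect_meta(fake_pkg))
```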