Releases · nomic-ai/nomic
v3.1.1
v3.1.0
3.1.0 (2024-07-22)
Features
- add option for nomic-project-v2 (857e6c4)
- allow arrow table for upload in map_text (2f0007a)
- classifier example (#321) (1d350bd)
- emb for client (25c5b5e)
- img embed updates (31ab403)
- map_data support for Nomic Embed Vision (#308) (e4a1c73)
- nomic sagemaker vision (2bda64f)
- notebook updates (4a1f888)
- release please? (#328) (d0572d5)
- run black & isort (00f5e48)
- sagemaker client updates for batched image (#319) (6c3d91e)
- task_type (cff0e2b)
- update local neighborhood parameter name and make all default (eaba319)
Bug Fixes
- allow indexed_field is none for image datasets with create_index (#315) (2d7d2da)
- api naming consistency, version (8b9d37a)
- assert image right type (469f44a)
- bugs introduced after adding types (56d298e)
- change file name (552ec75)
- check format in nomic/ folder (d96ca3a)
- dataset sidecar download - datum id sidecars are special (24e7d94)
- don't use b64 encode (338a5f4)
- fetch db-registered topic sidecars (df8a75a)
- max image request (41c459d)
- move indexed_field check earlier for image dataset (#320) (4b93268)
- nullable parameter (f20f7c9)
- outofbounds day (5f17f95)
- outofbounds day (43da768)
- parsing problem (0fdc178)
- Path and Tuple type issues (75f7767)
- problems with pa.compute (1c3a8ef)
- remove libcairo (033c894)
- remove npm ci (#329) (9640ded)
- remove pdb (f8d2050)
- resizing logic bug (#306) (11ab3c5)
- respect modality in create_index (#317) (fd3c108)
- return model name, not str model (c9f9703)
- spelling (d0a3ce5)
- text mode truncation param (f6572ac)
- topic label field default to indexed field if not supplied (#316) (8bc157d)
- type issues after rebasing (12c1b08)
- typing (8086591)
- typing + resize from file (ef9703a)
- update example (2681a66)
- update min dim (35a491d)
- use Optional (d4d5eb3)
- wait for project log (9cacd4a)
v3.0.6: Task type for text embeddings
- Allows specifying task type for text embeddings.
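A minimal sketch of the new parameter with the Python client (the model name, task_type value, and return shape shown here are illustrative, not prescriptive):

```python
from nomic import embed

# task_type tells the model how the embedding will be used;
# 'search_document' and 'search_query' are the usual retrieval pair.
output = embed.text(
    texts=["Nomic Atlas maps large unstructured datasets."],
    model="nomic-embed-text-v1.5",
    task_type="search_document",
)
print(len(output["embeddings"]))  # one embedding per input text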
Nomic Client 3.0.5: API key support, faster embedding inference
- Support for using Nomic API keys as your authentication method by running nomic login <api_key>
- Faster text embedding inference
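The same login can also be done programmatically; a minimal sketch (the key below is a placeholder, not a real credential):

```python
import nomic

# Programmatic equivalent of `nomic login <api_key>`;
# "nk-..." stands in for your actual Nomic API key.
nomic.login("nk-...")
```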
Nomic Client 3.0.0: Slugs, AtlasDataset and Developer Ergonomics
New Features
- All identifiers have moved to unique, human-readable, URL-valid organization/project slugs. These are auto-created for you on dataset creation and are guaranteed to be unique across your Atlas organization:
```python
from nomic import AtlasDataset

dataset = AtlasDataset('my-organization/my-dataset')
print(dataset.maps[0])
```
- AtlasProject renamed to AtlasDataset
- Makes supplying an index_name optional (it defaults to None)
- map_data provides a single, unified interface for indexing datasets.
Deprecations:
- map_embeddings and map_text are deprecated in favor of a single map_data
- The iterable argument to map_text is deprecated; map_data will not support this iterable workflow. Use AtlasDataset and .add_data instead (see the sketch after the transition steps below).
Transitioning from 2.x to 3.x
- Rename all map_text and map_embeddings calls to map_data
- Replace any use of AtlasProject with AtlasDataset
- See the examples folder in the Python client for details.
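As a rough sketch of the 3.x workflow (the dataset slug and field names are illustrative, and create_index's indexed_field parameter is assumed from the changelog entries above):

```python
from nomic import AtlasDataset

# 2.x: atlas.map_text(data=..., indexed_field="text")
# 3.x: create the dataset explicitly, add data, then index it.
dataset = AtlasDataset("my-organization/my-dataset")
dataset.add_data(data=[{"text": "hello atlas"}, {"text": "goodbye atlas"}])
dataset.create_index(indexed_field="text")
```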
v1.1.14: Topics and duplicate detection are accessible in the client.
- Visual map state can be accessed by downloading and manipulating Arrow files. See https://docs.nomic.ai/group_by_topics.html
- Duplicate detection results can be accessed.
v1.1.0: Apache Arrow Standard Compatibility
Atlas now relies on the Apache Arrow standard for data validation and integrity.
For users this means:
- Pandas dataframes and Arrow tables can be passed in during upload, and Atlas will automatically coerce data types.
- Atlas will fail less often due to data formatting and typing issues, and will provide more informative error messages when given malformed input.
- Atlas will be snappier to use due to resulting improvements in over-the-wire latency.
Technical Details
Atlas stores and transfers data using a subset of the Apache Arrow standard.
pyarrow is used to convert Python, pandas, and numpy data types to Arrow types; you can also pass any Arrow table (created by polars, duckdb, pyarrow, etc.) directly to Atlas and the types will be automatically converted.
Before being uploaded, all data is converted with the following rules:
- Strings are converted to Arrow strings and stored as UTF-8.
- Integers are converted to 32-bit integers. (If you have larger integers, they are probably either IDs, in which case you should convert them to strings, or a field you want to perform analysis on, in which case you should convert them to floats.)
- Floats are converted to 32-bit (single-precision) floats.
- Embeddings, regardless of precision, are uploaded as 16-bit (half-precision) floats, and stored in Arrow as FixedSizeList.
- All dates and datetimes are converted to Arrow timestamps with millisecond precision and no time zone. (If you have a use case that requires timezone information or micro/nanosecond precision, please let us know.)
- Categorical types (called 'dictionary' in Arrow) are supported, but values stored as categorical must be strings.
Other data types (including booleans, binary, lists, and structs) are not supported.
All fields besides embeddings and the user-specified ID field are nullable.
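To illustrate these rules, here is a minimal pyarrow sketch that mimics the coercions client-side (the schema and column names are hypothetical; Atlas performs the equivalent conversion for you on upload):

```python
import pyarrow as pa

table = pa.table({
    "id": ["a", "b"],                                  # strings stay UTF-8
    "count": pa.array([1, 2], type=pa.int64()),        # will be coerced to int32
    "score": pa.array([0.5, 0.7], type=pa.float64()),  # will be coerced to float32
})

# Approximate the type coercions Atlas applies before upload.
coerced = table.cast(pa.schema([
    pa.field("id", pa.string()),
    pa.field("count", pa.int32()),
    pa.field("score", pa.float32()),
]))
print(coerced.schema)
```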