See the frontier of multimodal model training by exploring this 11M-document subset of OBELICS, an open collection of interleaved image-text web documents containing 141M English documents, 115B text tokens, and 353M images. Created in collaboration between Nomic and Hugging Face.
Google's Vertex AI group used Nomic to visualize their embeddings of 8 million Stack Overflow posts. Explore where people get stuck while coding and what answers they got, from 'hello world' programs in every language to assembly language and Terraform configurations.