{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# TSV\n", "\n", ">A [tab-separated values (TSV)](https://en.wikipedia.org/wiki/Tab-separated_values) file is a simple, text-based file format for storing tabular data. Records are separated by newlines, and values within a record are separated by tab characters." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `UnstructuredTSVLoader`\n", "\n", "You can load a TSV file with `UnstructuredTSVLoader`. In `\"elements\"` mode, the loader also stores an HTML representation of the table in the document metadata under the `text_as_html` key." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from langchain.document_loaders.tsv import UnstructuredTSVLoader" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "loader = UnstructuredTSVLoader(\n", " file_path=\"example_data/mlb_teams_2012.csv\", mode=\"elements\"\n", ")\n", "docs = loader.load()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "
Nationals, 81.34, 98 | \n", "
Reds, 82.20, 97 | \n", "
Yankees, 197.96, 95 | \n", "
Giants, 117.62, 94 | \n", "
Braves, 83.31, 94 | \n", "
Athletics, 55.37, 94 | \n", "
Rangers, 120.51, 93 | \n", "
Orioles, 81.43, 93 | \n", "
Rays, 64.17, 90 | \n", "
Angels, 154.49, 89 | \n", "
Tigers, 132.30, 88 | \n", "
Cardinals, 110.30, 88 | \n", "
Dodgers, 95.14, 86 | \n", "
White Sox, 96.92, 85 | \n", "
Brewers, 97.65, 83 | \n", "
Phillies, 174.54, 81 | \n", "
Diamondbacks, 74.28, 81 | \n", "
Pirates, 63.43, 79 | \n", "
Padres, 55.24, 76 | \n", "
Mariners, 81.97, 75 | \n", "
Mets, 93.35, 74 | \n", "
Blue Jays, 75.48, 73 | \n", "
Royals, 60.91, 72 | \n", "
Marlins, 118.07, 69 | \n", "
Red Sox, 173.18, 69 | \n", "
Indians, 78.43, 68 | \n", "
Twins, 94.08, 66 | \n", "
Rockies, 78.06, 64 | \n", "
Cubs, 88.19, 61 | \n", "
Astros, 60.65, 55 | \n", "