Issue #XXX Make progress on cohere and ollama rag demos

pull/1111/head
Rasmus Storjohann 3 months ago
parent e8b7884b6d
commit 4d3957052d

@@ -0,0 +1,39 @@
import os
import cohere
from dotenv import load_dotenv
load_dotenv()
COHERE_API_KEY = os.getenv('COHERE_API_KEY')
co = cohere.Client(COHERE_API_KEY)
if False:  # disabled: chat grounded on inline documents
    r = co.chat(
        model="command",
        message="Who is more popular: Nsync or Backstreet Boys?",
        documents=[
            {
                "title": "CSPC: Backstreet Boys Popularity Analysis - ChartMasters",
                "snippet": "↓ Skip to Main Content\n\nMusic industry One step closer to being accurate\n\nCSPC: Backstreet Boys Popularity Analysis\n\nHernán Lopez Posted on February 9, 2017 Posted in CSPC 72 Comments Tagged with Backstreet Boys, Boy band\n\nAt one point, Backstreet Boys defined success: massive albums sales across the globe, great singles sales, plenty of chart topping releases, hugely hyped tours and tremendous media coverage.\n\nIt is true that they benefited from extraordinarily good market conditions in all markets. After all, the all-time record year for the music business, as far as revenues in billion dollars are concerned, was actually 1999. That is, back when this five men group was at its peak."
            },
            {
                "title": "CSPC: NSYNC Popularity Analysis - ChartMasters",
                "snippet": "↓ Skip to Main Content\n\nMusic industry One step closer to being accurate\n\nCSPC: NSYNC Popularity Analysis\n\nMJD Posted on February 9, 2018 Posted in CSPC 27 Comments Tagged with Boy band, N'Sync\n\nAt the turn of the millennium three teen acts were huge in the US, the Backstreet Boys, Britney Spears and NSYNC. The latter is the only one we havent study so far. It took 15 years and Adele to break their record of 2,4 million units sold of No Strings Attached in its first week alone.\n\nIt wasnt a fluke, as the second fastest selling album of the Soundscan era prior 2015, was also theirs since Celebrity debuted with 1,88 million units sold."
            },
            {
                "title": "CSPC: Backstreet Boys Popularity Analysis - ChartMasters",
                "snippet": " 1997, 1998, 2000 and 2001 also rank amongst some of the very best years.\n\nYet the way many music consumers especially teenagers and young womens embraced their output deserves its own chapter. If Jonas Brothers and more recently One Direction reached a great level of popularity during the past decade, the type of success achieved by Backstreet Boys is in a completely different level as they really dominated the business for a few years all over the world, including in some countries that were traditionally hard to penetrate for Western artists.\n\nWe will try to analyze the extent of that hegemony with this new article with final results which will more than surprise many readers."
            },
            {
                "title": "CSPC: NSYNC Popularity Analysis - ChartMasters",
                "snippet": " Was the teen group led by Justin Timberlake really that big? Was it only in the US where they found success? Or were they a global phenomenon?\n\nAs usual, Ill be using the Commensurate Sales to Popularity Concept in order to relevantly gauge their results. This concept will not only bring you sales information for all NSYNCs albums, physical and download singles, as well as audio and video streaming, but it will also determine their true popularity. If you are not yet familiar with the CSPC method, the next page explains it with a short video. I fully recommend watching the video before getting into the sales figures."
            }
        ])
    print(r.citations)
if True:  # enabled: chat grounded via the web-search connector
    r = co.chat(
        model="command",
        message="Who is more popular: Nsync or Backstreet Boys?",
        connectors=[{"id": "web-search"}])
    print(r)

@@ -0,0 +1,21 @@
import os
import cohere
from dotenv import load_dotenv
load_dotenv()
COHERE_API_KEY = os.getenv('COHERE_API_KEY')
co = cohere.Client(COHERE_API_KEY)
result = co.chat(
model="command",
message="Where do the tallest penguins live?",
documents=[
{"title": "Tall penguins", "snippet": "Emperor penguins are the tallest."},
{"title": "Penguin habitats", "snippet": "Emperor penguins only live in Antarctica."},
{"title": "What are animals?", "snippet": "Animals are different from plants."}
])
print(25*'x')
print(result.citations)
print(25*'x')
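The script above only prints the raw `result.citations` list. As a minimal sketch of what could be done with it, the helper below inserts document markers after each cited span. It assumes each citation is a mapping with `start`, `end`, and `document_ids` keys holding character offsets into the reply text; that shape is an assumption and may differ in the SDK version used by this commit. `annotate` and the sample values are hypothetical and not part of the commit.

```python
def annotate(text, citations):
    """Insert [doc_id,...] markers after each cited span of text."""
    out = []
    cursor = 0
    # Process citations left to right so the offsets stay valid.
    for c in sorted(citations, key=lambda c: c["start"]):
        out.append(text[cursor:c["end"]])
        out.append("[" + ",".join(c["document_ids"]) + "]")
        cursor = max(cursor, c["end"])
    out.append(text[cursor:])
    return "".join(out)


# Hypothetical reply text and citations, for illustration only.
sample_text = "Emperor penguins live in Antarctica."
sample_citations = [
    {"start": 0, "end": 16, "document_ids": ["doc_0"]},
    {"start": 25, "end": 35, "document_ids": ["doc_1"]},
]
print(annotate(sample_text, sample_citations))
```

With real output, `sample_text` would be `result.text` and `sample_citations` would be `result.citations`, provided the citation entries have the assumed keys.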

@@ -0,0 +1,71 @@
import os
import cohere
import csv
from dotenv import load_dotenv
load_dotenv()
COHERE_API_KEY = os.getenv('COHERE_API_KEY')
co = cohere.Client(COHERE_API_KEY)
documents = []
with open('data/noc.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        # Named `code` rather than `id` to avoid shadowing the builtin
        code = row['Code - NOC 2021 V1.0']
        record = {
            'id': code,
            'title': row['Class title'],
            'snippet': 'id="' + code + '": ' + row['Class definition']
        }
        documents.append(record)
def include_page(document):
    # Keep only records whose NOC code starts with '62'
    return document['id'].startswith('62')
selected_documents = [d for d in documents if include_page(d)]
admin_assistant_js = ("As an Administrative Assistant in the film industry, you will play a pivotal " +
    "role in supporting the administrative and organizational functions of film " +
    "production companies, studios, or related entities. You will be responsible for " +
    "providing comprehensive administrative support to ensure smooth operations and " +
    "facilitate the execution of various projects within the dynamic and fast-paced " +
    "environment of the film industry.")
hopital_chef_jd = ("Title: Hospital Chef\n" +
"Job Summary:\n" +
"As a Hospital Chef, you will play a vital role in ensuring the provision of high-quality, nutritious meals for patients, \n" +
"staff, and visitors in a healthcare setting. Working closely with dietitians, nutritionists, and culinary staff, you\n" +
" will be responsible for planning, preparing, and overseeing the production of meals that meet dietary requirements,\n" +
" taste preferences, and nutritional standards.\n" +
"Responsibilities:\n" +
"1. **Menu Planning:** Collaborate with dietitians and nutritionists to plan menus that meet the dietary needs of patients\n" +
" while adhering to medical guidelines and dietary restrictions.\n" +
"2. **Food Preparation:** Prepare and cook meals according to standardized recipes, ensuring consistency in taste, \n" +
"presentation, and portion sizes. Monitor food quality and taste to maintain high standards.\n" +
"3. **Nutritional Considerations:** Ensure that meals are balanced, nutritious, and appropriate for patients with\n" +
" specific medical conditions or dietary restrictions, such as diabetes, food allergies, or heart disease.\n" +
"4. **Food Safety and Sanitation:** Adhere to strict food safety and sanitation protocols to prevent contamination and \n" +
"ensure compliance with health regulations. Monitor kitchen hygiene, equipment maintenance, and food storage practices.\n" +
"5. **Inventory Management:** Oversee inventory levels of food and kitchen supplies, ordering ingredients and supplies \n" +
"as needed to maintain stock levels and minimize waste. Monitor food costs and budgetary constraints.\n" +
"6. **Team Leadership:** Supervise kitchen staff, including cooks, sous chefs, and kitchen assistants, providing \n" +
"guidance, training, and support to ensure efficient operations and teamwork.\n" +
"7. **Special Dietary Needs:** Accommodate special dietary requests from patients, staff, and visitors, \n" +
"including vegetarian, vegan, gluten-free, and other dietary preferences or restrictions.\n" +
"8. **Menu Development:** Continuously evaluate and update menus to incorporate seasonal ingredients, culinary\n" +
" trends, and feedback from patients and staff. Introduce new recipes and dishes to enhance the dining experience.\n" +
"9. **Patient Satisfaction:** Solicit feedback from patients and staff regarding meal quality, preferences,\n" +
" and satisfaction. Implement improvements and adjustments based on feedback to enhance the overall dining experience.\n" +
"10. **Regulatory Compliance:** Ensure compliance with regulatory agencies, such as the Department of Health, Joint Commission, and local health authorities")
result = co.chat(
model='command',
message='Which of the provided documents most closely match this job description: "' + hopital_chef_jd + '", include the corresponding ids in the response',
documents=selected_documents)
print(40*'*')
print(result.message)
print(40*'*')
print(result.text)
print(40*'*')
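The loading and prefix filtering in this file depend on `data/noc.csv`, which is not part of the diff. A minimal sketch of the same idea that can be run without that file follows. The column names are taken from the script; `load_documents`, `SAMPLE_CSV`, and its placeholder rows are hypothetical and are not real NOC entries.

```python
import csv
import io

# Placeholder rows for illustration only; not real NOC data.
SAMPLE_CSV = """Code - NOC 2021 V1.0,Class title,Class definition
62010,Placeholder title A,Placeholder definition A
63200,Placeholder title B,Placeholder definition B
"""


def load_documents(csvfile, prefix=""):
    """Read NOC-style rows and keep those whose code starts with prefix."""
    documents = []
    for row in csv.DictReader(csvfile):
        code = row["Code - NOC 2021 V1.0"]
        if not code.startswith(prefix):
            continue
        documents.append({
            "id": code,
            "title": row["Class title"],
            "snippet": 'id="' + code + '": ' + row["Class definition"],
        })
    return documents


docs = load_documents(io.StringIO(SAMPLE_CSV, newline=""), prefix="62")
print(len(docs), docs[0]["id"])
```

Filtering while reading avoids building the full list first, which may matter little at this data size but keeps the prefix in one place.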

@@ -21,8 +21,12 @@ with open('data/noc.csv', newline='') as csvfile:
}
noc_data.append(record)
# with 822 docs nice -n 19 python3 ollama-rag-example.py 7.96s user 1.58s system 13% cpu 1:12.98 total
# with 105 docs: nice -n 19 python3 ollama-rag-example.py 3.17s user 2.18s system 8% cpu 1:00.42 total
def include_page(page):
-    return True # page['code'].startswith('5')
+    return page['code'].startswith('2')
def to_page_content(page):
return 'code="' + page['code'] + '" title="' + page['title'] + '" definition="' + page['definition'] + '"'
@@ -61,11 +65,12 @@ vectorstore = Chroma.from_documents(
retriever = vectorstore.as_retriever()
# 3. Before RAG
-print("Before RAG\n")
-before_rag_template = "What is {topic}"
-before_rag_prompt = ChatPromptTemplate.from_template(before_rag_template)
-before_rag_chain = before_rag_prompt | model_local | StrOutputParser()
-print(before_rag_chain.invoke({"topic" : "trademark agents"}))
+if False:
+    print("Before RAG\n")
+    before_rag_template = "What is {topic}"
+    before_rag_prompt = ChatPromptTemplate.from_template(before_rag_template)
+    before_rag_chain = before_rag_prompt | model_local | StrOutputParser()
+    print(before_rag_chain.invoke({"topic" : "trademark agents"}))
# 4. After RAG
print("\n###########\nAfter RAG")
@@ -115,5 +120,20 @@ hopital_chef_jd = ("Title: Hospital Chef\n" +
" and satisfaction. Implement improvements and adjustments based on feedback to enhance the overall dining experience.\n" +
"10. **Regulatory Compliance:** Ensure compliance with regulatory agencies, such as the Department of Health, Joint Commission, and local health authorities")
geological_engineer_jd = ("""Title: Geological Engineer
Job Summary:
As a Geological Engineer, you will play a crucial role in assessing the geological conditions of sites and providing expertise in engineering projects related to natural resources exploration, environmental protection, infrastructure development, and hazard mitigation. Your responsibilities will involve analyzing geological data, conducting field surveys, and collaborating with multidisciplinary teams to ensure the safe and efficient execution of engineering projects.
Responsibilities:
1. **Site Investigation:** Conduct geological surveys and site investigations to assess geological features, including rock formations, soil composition, groundwater conditions, and potential hazards such as landslides, earthquakes, or sinkholes.
2. **Geological Mapping:** Create detailed geological maps and models using specialized software and mapping techniques to identify geological structures, mineral deposits, and potential risks for engineering projects.
3. **Geotechnical Analysis:** Perform geotechnical analyses to evaluate soil stability, bearing capacity, and slope stability for the design and construction of infrastructure projects, such as buildings, bridges, roads, and dams.
4. **Risk Assessment:** Assess geological risks and hazards associated with engineering projects, including seismic activity, soil erosion, groundwater contamination, and geological instabilities, and develop mitigation strategies to minimize risks.
5. **Environmental Impact Assessment:** Evaluate the environmental impact of engineering activities on natural ecosystems, water resources, and air quality, and recommend measures to mitigate negative impacts and ensure compliance with environmental regulations.
6. **Resource Exploration:** Assist in the exploration and extraction of natural resources, such as minerals, oil, gas, and water, by analyzing geological data, conducting drilling surveys, and identifying potential resource reserves.
7. **Project Planning:** Provide geological input and expertise during the planning and design phases of engineering projects, including site selection, foundation design, and construction techniques, to optimize project outcomes and minimize geological risks.
8. **Data Analysis:** Analyze geological data collected from field surveys, laboratory tests, and remote sensing""")
print(after_rag_chain.invoke("What is the title and code of the document that most closely matches this job description: '" + hopital_chef_jd + "'"))
print(after_rag_chain.invoke("What are the three documents that most closely match this job description: '" + geological_engineer_jd + "'. Answer in JSON format with the top level identifier 'results', and attributes code, title, definition, score and comment for each matching document, where score is a number between 0 and 1 indicating how close the match is to the job description, with 1 meaning really close, and comment explains why this particular document was selected as a good match."))

@@ -0,0 +1,139 @@
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_community import embeddings
from langchain_community.chat_models import ChatOllama
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain.text_splitter import CharacterTextSplitter
from langchain.docstore.document import Document
import csv
noc_data = []
with open('data/noc.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        record = {
            'code': row['Code - NOC 2021 V1.0'],
            'title': row['Class title'],
            'definition': row['Class definition']
        }
        noc_data.append(record)
# with 822 docs nice -n 19 python3 ollama-rag-example.py 7.96s user 1.58s system 13% cpu 1:12.98 total
# with 105 docs: nice -n 19 python3 ollama-rag-example.py 3.17s user 2.18s system 8% cpu 1:00.42 total
def include_page(page):
    return True # page['code'].startswith('2')
def to_page_content(page):
    return 'code="' + page['code'] + '" title="' + page['title'] + '" definition="' + page['definition'] + '"'
docs = [[Document(page_content=to_page_content(page)) for page in noc_data if include_page(page)]]
# Sources
# https://www.youtube.com/watch?v=jENqvjpkwmw
model_local = ChatOllama(model="mistral")
# 1. Split data into chunks
urls = [
"https://ollama.com",
"https://ollama.com/blog/windows-preview",
"https://ollama.com/blog/openai-compatibility",
]
# docs = [WebBaseLoader(url).load() for url in urls];
flattened_docs = [item for sublist in docs for item in sublist]
print(flattened_docs)
print('total documents included = ', len(flattened_docs))
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=7500, chunk_overlap=100)
doc_splits = text_splitter.split_documents(flattened_docs)
# 2. Convert documents to Embeddings and store them
vectorstore = Chroma.from_documents(
documents=doc_splits,
collection_name="rag-chroma",
embedding=embeddings.ollama.OllamaEmbeddings(model='nomic-embed-text')
)
retriever = vectorstore.as_retriever()
# 3. Before RAG
if False:
    print("Before RAG\n")
    before_rag_template = "What is {topic}"
    before_rag_prompt = ChatPromptTemplate.from_template(before_rag_template)
    before_rag_chain = before_rag_prompt | model_local | StrOutputParser()
    print(before_rag_chain.invoke({"topic" : "trademark agents"}))
# 4. After RAG
print("\n###########\nAfter RAG")
after_rag_template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
after_rag_prompt = ChatPromptTemplate.from_template(after_rag_template)
after_rag_chain = (
{"context": retriever, "question": RunnablePassthrough()}
| after_rag_prompt
| model_local
| StrOutputParser()
)
admin_assistant_js = ("As an Administrative Assistant in the film industry, you will play a pivotal " +
    "role in supporting the administrative and organizational functions of film " +
    "production companies, studios, or related entities. You will be responsible for " +
    "providing comprehensive administrative support to ensure smooth operations and " +
    "facilitate the execution of various projects within the dynamic and fast-paced " +
    "environment of the film industry.")
hopital_chef_jd = ("Title: Hospital Chef\n" +
"Job Summary:\n" +
"As a Hospital Chef, you will play a vital role in ensuring the provision of high-quality, nutritious meals for patients, \n" +
"staff, and visitors in a healthcare setting. Working closely with dietitians, nutritionists, and culinary staff, you\n" +
" will be responsible for planning, preparing, and overseeing the production of meals that meet dietary requirements,\n" +
" taste preferences, and nutritional standards.\n" +
"Responsibilities:\n" +
"1. **Menu Planning:** Collaborate with dietitians and nutritionists to plan menus that meet the dietary needs of patients\n" +
" while adhering to medical guidelines and dietary restrictions.\n" +
"2. **Food Preparation:** Prepare and cook meals according to standardized recipes, ensuring consistency in taste, \n" +
"presentation, and portion sizes. Monitor food quality and taste to maintain high standards.\n" +
"3. **Nutritional Considerations:** Ensure that meals are balanced, nutritious, and appropriate for patients with\n" +
" specific medical conditions or dietary restrictions, such as diabetes, food allergies, or heart disease.\n" +
"4. **Food Safety and Sanitation:** Adhere to strict food safety and sanitation protocols to prevent contamination and \n" +
"ensure compliance with health regulations. Monitor kitchen hygiene, equipment maintenance, and food storage practices.\n" +
"5. **Inventory Management:** Oversee inventory levels of food and kitchen supplies, ordering ingredients and supplies \n" +
"as needed to maintain stock levels and minimize waste. Monitor food costs and budgetary constraints.\n" +
"6. **Team Leadership:** Supervise kitchen staff, including cooks, sous chefs, and kitchen assistants, providing \n" +
"guidance, training, and support to ensure efficient operations and teamwork.\n" +
"7. **Special Dietary Needs:** Accommodate special dietary requests from patients, staff, and visitors, \n" +
"including vegetarian, vegan, gluten-free, and other dietary preferences or restrictions.\n" +
"8. **Menu Development:** Continuously evaluate and update menus to incorporate seasonal ingredients, culinary\n" +
" trends, and feedback from patients and staff. Introduce new recipes and dishes to enhance the dining experience.\n" +
"9. **Patient Satisfaction:** Solicit feedback from patients and staff regarding meal quality, preferences,\n" +
" and satisfaction. Implement improvements and adjustments based on feedback to enhance the overall dining experience.\n" +
"10. **Regulatory Compliance:** Ensure compliance with regulatory agencies, such as the Department of Health, Joint Commission, and local health authorities")
geological_engineer_jd = ("""Title: Geological Engineer
Job Summary:
As a Geological Engineer, you will play a crucial role in assessing the geological conditions of sites and providing expertise in engineering projects related to natural resources exploration, environmental protection, infrastructure development, and hazard mitigation. Your responsibilities will involve analyzing geological data, conducting field surveys, and collaborating with multidisciplinary teams to ensure the safe and efficient execution of engineering projects.
Responsibilities:
1. **Site Investigation:** Conduct geological surveys and site investigations to assess geological features, including rock formations, soil composition, groundwater conditions, and potential hazards such as landslides, earthquakes, or sinkholes.
2. **Geological Mapping:** Create detailed geological maps and models using specialized software and mapping techniques to identify geological structures, mineral deposits, and potential risks for engineering projects.
3. **Geotechnical Analysis:** Perform geotechnical analyses to evaluate soil stability, bearing capacity, and slope stability for the design and construction of infrastructure projects, such as buildings, bridges, roads, and dams.
4. **Risk Assessment:** Assess geological risks and hazards associated with engineering projects, including seismic activity, soil erosion, groundwater contamination, and geological instabilities, and develop mitigation strategies to minimize risks.
5. **Environmental Impact Assessment:** Evaluate the environmental impact of engineering activities on natural ecosystems, water resources, and air quality, and recommend measures to mitigate negative impacts and ensure compliance with environmental regulations.
6. **Resource Exploration:** Assist in the exploration and extraction of natural resources, such as minerals, oil, gas, and water, by analyzing geological data, conducting drilling surveys, and identifying potential resource reserves.
7. **Project Planning:** Provide geological input and expertise during the planning and design phases of engineering projects, including site selection, foundation design, and construction techniques, to optimize project outcomes and minimize geological risks.
8. **Data Analysis:** Analyze geological data collected from field surveys, laboratory tests, and remote sensing""")
print(after_rag_chain.invoke("What are the three documents that most closely match this job description: '" + geological_engineer_jd + "'. Answer in JSON format with the top level identifier 'results', and attributes code, title, definition, score and comment for each matching document, where score is a number between 0 and 1 indicating how close the match is to the job description, with 1 meaning really close, and comment explains why each document was selected as a good match."))
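The prompt above asks the model for JSON with a top-level `results` identifier, but the script prints the reply as a plain string. A minimal sketch of parsing such a reply defensively follows, assuming the model may wrap the JSON in extra prose or return malformed entries. `parse_results` and the sample reply are hypothetical and not part of the commit; the attribute names `code`, `title`, and `score` come from the prompt.

```python
import json


def parse_results(raw):
    """Extract the 'results' list from a model reply, best match first."""
    # Tolerate prose around the JSON by slicing from the first '{' to the last '}'.
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        return []
    try:
        payload = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return []
    results = payload.get("results") if isinstance(payload, dict) else None
    if not isinstance(results, list):
        return []
    # Keep only entries whose score is a number in the range the prompt asks for.
    valid = [
        r for r in results
        if isinstance(r, dict)
        and isinstance(r.get("score"), (int, float))
        and 0 <= r["score"] <= 1
    ]
    return sorted(valid, key=lambda r: r["score"], reverse=True)


# Hypothetical reply for illustration; codes and titles are placeholders.
sample_reply = (
    'Here are the matches:\n'
    '{"results": ['
    '{"code": "00000", "title": "Placeholder A", "score": 0.4}, '
    '{"code": "11111", "title": "Placeholder B", "score": 0.9}, '
    '{"code": "22222", "title": "Placeholder C", "score": "high"}]}'
)
for match in parse_results(sample_reply):
    print(match["code"], match["score"])
```

Returning an empty list on any parse failure keeps the caller simple; a stricter variant could raise instead so that malformed replies are not silently dropped.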