Contributing to LangChain

Hi there! Thank you for even being interested in contributing to LangChain. As an open source project in a rapidly developing field, we are extremely open to contributions, whether they be in the form of new features, improved infra, better documentation, or bug fixes.

🗺️ Guidelines

👩‍💻 Contributing Code

To contribute to this project, please follow a "fork and pull request" workflow. Please do not try to push directly to this repo unless you are a maintainer.

Please follow the checked-in pull request template when opening pull requests. Note related issues and tag relevant maintainers.

Pull requests cannot land without passing the formatting, linting and testing checks first. See Common Tasks for how to run these checks locally.

It's essential that we maintain great documentation and testing. If you:

  • Fix a bug
    • Add a relevant unit or integration test when possible. These live in tests/unit_tests and tests/integration_tests.
  • Make an improvement
    • Update any affected example notebooks and documentation. These live in docs.
    • Update unit and integration tests when relevant.
  • Add a feature
    • Add a demo notebook in docs/modules.
    • Add unit and integration tests.

We're a small, building-oriented team. If there's something you'd like to add or change, opening a pull request is the best way to get our attention.

🚩GitHub Issues

Our issues page is kept up to date with bugs, improvements, and feature requests.

There is a taxonomy of labels to help with sorting and discovery of issues of interest. Please use these to help organize issues.

If you start working on an issue, please assign it to yourself.

If you are adding an issue, please try to keep it focused on a single, modular bug/improvement/feature. If two issues are related, or blocking, please link them rather than combining them.

We will try to keep these issues as up to date as possible, though with the rapid rate of development in this field some may get out of date. If you notice this happening, please let us know.

🙋Getting Help

Our goal is to have the simplest developer setup possible. Should you experience any difficulty getting set up, please contact a maintainer! Not only do we want to help get you unblocked, but we also want to make sure that the process is smooth for future contributors.

In a similar vein, we do enforce certain linting, formatting, and documentation standards in the codebase. If you are finding these difficult (or even just annoying) to work with, feel free to contact a maintainer for help - we do not want these to get in the way of getting good code into the codebase.

🚀 Quick Start

Note: You can run this repository locally (which is described below) or in a development container (which is described in the .devcontainer folder).

This project uses Poetry as a dependency manager. Check out Poetry's documentation on how to install it on your system before proceeding.

Note: If you use Conda or Pyenv as your environment / package manager, avoid dependency conflicts by doing the following first:

  1. Before installing Poetry, create and activate a new Conda env (e.g. conda create -n langchain python=3.9)
  2. Install Poetry (see above)
  3. Tell Poetry to use the virtualenv python environment (poetry config virtualenvs.prefer-active-python true)
  4. Continue with the following steps.

To install requirements:

poetry install -E all

This will install all requirements for running the package, examples, linting, formatting, tests, and coverage. Note the -E all flag will install all optional dependencies necessary for integration testing.

Note: If you're running Poetry 1.4.1 and receive a WheelFileValidationError for debugpy during installation, you can try either downgrading to Poetry 1.4.0 or disabling "modern installation" (poetry config installer.modern-installation false) and then reinstalling the requirements. See this debugpy issue for more details.

Now, you should be able to run the common tasks in the following section. To double check, run make test; all tests should pass. If they don't, you may need to pip install additional dependencies, such as numexpr and openapi_schema_pydantic.

Common Tasks

Type make for a list of common tasks.

Code Formatting

Formatting for this project is done via a combination of Black and isort.

To run formatting for this project:

make format

Linting

Linting for this project is done via a combination of Black, isort, flake8, and mypy.

To run linting for this project:

make lint

We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

Coverage

Code coverage (i.e. the amount of code that is covered by unit tests) helps identify areas of the code that are potentially more or less brittle.

To get a report of current coverage, run the following:

make coverage

Working with Optional Dependencies

LangChain relies heavily on optional dependencies to keep the LangChain package lightweight.

If you're adding a new dependency to LangChain, assume that it will be an optional dependency, and that most users won't have it installed.

Users that do not have the dependency installed should be able to import your code without any side effects (no warnings, no errors, no exceptions).

To introduce the dependency to the pyproject.toml file correctly, please do the following:

  1. Add the dependency to the main group as an optional dependency:
     poetry add --optional [package_name]
  2. Open pyproject.toml and add the dependency to the extended_testing extra.
  3. Relock the poetry file to update the extra:
     poetry lock --no-update
  4. Add a unit test that at the very least attempts to import the new code. Ideally, the unit test makes use of lightweight fixtures to test the logic of the code.
  5. Please use the @pytest.mark.requires(package_name) decorator for any tests that require the dependency.
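
The "import without side effects" requirement above is often met by deferring the import until the dependency is actually used. The following is a minimal sketch of that pattern, not LangChain's actual code: hypothetical_optional_pkg, _import_optional_pkg, OptionalFeature, and the process function are made-up names used only for illustration.

```python
from typing import Any


def _import_optional_pkg() -> Any:
    """Import the optional dependency lazily, with a helpful error if missing."""
    try:
        # Hypothetical package name, used only for illustration.
        import hypothetical_optional_pkg
    except ImportError as exc:
        raise ImportError(
            "Could not import hypothetical_optional_pkg. "
            "Please install it with `pip install hypothetical_optional_pkg`."
        ) from exc
    return hypothetical_optional_pkg


class OptionalFeature:
    """Defining and instantiating this class never touches the optional package."""

    def run(self, text: str) -> Any:
        # The import happens here, at call time, not at module import time.
        pkg = _import_optional_pkg()
        # `process` is an assumed function of the hypothetical package.
        return pkg.process(text)
```

With this layout, importing the module and constructing OptionalFeature works for everyone; only calling run raises an ImportError when the package is absent. A test that exercises run would then carry the @pytest.mark.requires decorator mentioned in the steps above.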

Testing

See the section above on working with optional dependencies.

Unit Tests

Unit tests cover modular logic that does not require calls to outside APIs.

To run unit tests:

make test

To run unit tests in Docker:

make docker_tests

If you add new logic, please add a unit test.

Integration Tests

Integration tests cover logic that requires making calls to outside APIs (often integration with other services).

Warning: Almost no tests should be integration tests.

Tests that require making network connections make it difficult for other developers to test the code.

Instead, favor using the responses library and/or mock.patch to mock requests with small fixtures.
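
As a rough illustration of the mock.patch approach, the sketch below replaces the network call with a small in-memory fixture so the test runs offline. It uses only the standard library; fetch_title and the example URL are hypothetical and do not exist in the codebase.

```python
import json
import urllib.request
from unittest import mock


def fetch_title(url: str) -> str:
    """Hypothetical helper that calls an outside API and returns one field."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read())["title"]


def test_fetch_title_without_network() -> None:
    # Small fixture standing in for the real API response.
    fake_response = mock.MagicMock()
    fake_response.read.return_value = b'{"title": "hello"}'
    # Make the mock usable in a `with` statement, like a real response object.
    fake_response.__enter__.return_value = fake_response

    with mock.patch("urllib.request.urlopen", return_value=fake_response) as mocked:
        assert fetch_title("https://example.invalid/item") == "hello"
        mocked.assert_called_once_with("https://example.invalid/item")
```

Because the patched call never leaves the process, a test like this can live in tests/unit_tests rather than tests/integration_tests.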

To run integration tests:

make integration_tests

If you add support for a new external API, please add a new integration test.

Adding a Jupyter Notebook

If you are adding a Jupyter notebook example, you'll want to install the optional dev dependencies.

To install dev dependencies:

poetry install --with dev

Launch a notebook:

poetry run jupyter notebook

When you run poetry install, the langchain package is installed as editable in the virtualenv, so your new logic can be imported into the notebook.

Documentation

Contribute Documentation

Docs are largely autogenerated by Sphinx from the code.

For that reason, we ask that you add good documentation to all classes and methods.
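
To give a sense of what this can look like in practice, here is a small sketch of a documented function. It assumes a Google-style docstring layout (Args / Returns / Raises), which Sphinx can render through its napoleon extension; the function split_into_chunks is hypothetical and only serves as an example.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive chunks of at most ``chunk_size`` characters.

    Args:
        text: The text to split.
        chunk_size: Maximum number of characters per chunk. Must be positive.

    Returns:
        The chunks in their original order. An empty input yields an empty list.

    Raises:
        ValueError: If ``chunk_size`` is not positive.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Step through the text in strides of chunk_size and slice out each piece.
    return [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]
```

The docstring states what the function does, what each argument means, and how it fails, which is what ends up in the generated reference pages.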

Similar to linting, we recognize documentation can be annoying. If you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

Build Documentation Locally

Before building the documentation, it is always a good idea to clean the build directory:

make docs_clean

Next, you can run the linkchecker to make sure all links are valid:

make docs_linkcheck

Finally, you can build the documentation as outlined below:

make docs_build

🏭 Release Process

As of now, LangChain has an ad hoc release process: releases are cut with high frequency by a developer and published to PyPI.

LangChain follows the semver versioning standard. However, as pre-1.0 software, even patch releases may contain non-backwards-compatible changes.

🌟 Recognition

If your contribution has made its way into a release, we will want to give you credit on Twitter (only if you want though)! If you have a Twitter account you would like us to mention, please let us know in the PR or in another manner.