Installation

Requirements

  • Python ≥ 3.9
  • MongoDB (local or via Docker)
  • A running LAMAPI entity retrieval service

Install from Source

git clone https://github.com/unimib-datAI/alligator-emd.git
cd alligator-emd
pip install -e .

Optional Extras

# FastAPI REST backend
pip install -e ".[app]"

# Development tools (pytest, black, mypy, etc.)
pip install -e ".[dev]"

Docker Compose

The repository ships a ready-to-use docker-compose.yml that starts both the Alligator backend and MongoDB:

docker compose up

For debug mode with Node.js inspector:

docker compose -f docker-compose.debug.yml up

Environment Variables

Create a .env file in the project root (a sample is provided):

Variable                     Description
ENTITY_RETRIEVAL_ENDPOINT    LAMAPI entity lookup endpoint URL
ENTITY_RETRIEVAL_TOKEN       Auth token for the retrieval API
OBJECT_RETRIEVAL_ENDPOINT    Object relationship endpoint URL
LITERAL_RETRIEVAL_ENDPOINT   Literal values endpoint URL
MONGO_URI                    MongoDB connection URI (default: mongodb://localhost:27017/)
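
As a rough illustration of how these variables might be consumed, the sketch below reads them from the process environment and applies the documented MONGO_URI default. The RetrievalSettings class and load_settings function are hypothetical names, not part of Alligator, and treating the entity endpoint and token as mandatory is an assumption made here for the example. The sketch does not parse the .env file itself; it assumes the variables have already been exported (for example by Docker Compose or a dotenv loader).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RetrievalSettings:
    """Hypothetical container for the variables listed above."""
    entity_endpoint: str
    entity_token: str
    object_endpoint: Optional[str]
    literal_endpoint: Optional[str]
    mongo_uri: str


def load_settings(env: Mapping[str, str] = os.environ) -> RetrievalSettings:
    # Assumption: the entity endpoint and token are treated as required here;
    # the documentation does not say which variables are mandatory.
    required = ("ENTITY_RETRIEVAL_ENDPOINT", "ENTITY_RETRIEVAL_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    return RetrievalSettings(
        entity_endpoint=env["ENTITY_RETRIEVAL_ENDPOINT"],
        entity_token=env["ENTITY_RETRIEVAL_TOKEN"],
        object_endpoint=env.get("OBJECT_RETRIEVAL_ENDPOINT"),
        literal_endpoint=env.get("LITERAL_RETRIEVAL_ENDPOINT"),
        # Default taken from the table above.
        mongo_uri=env.get("MONGO_URI", "mongodb://localhost:27017/"),
    )
```

Passing a plain dictionary instead of os.environ makes the function easy to exercise in tests without touching the real environment.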

Match Threshold Variables

These control the ML match decision behaviour (see Scoring & Thresholds):

Variable             Default   Description
RAW_MIN_CONFIDENCE   0.1       Minimum raw ML confidence required for auto-matching
MATCH_THRESHOLD      0.5       Minimum normalised score for the top candidate to be accepted
MATCH_MARGIN_DELTA   0.1       Accept if top candidate leads the second candidate by at least this margin
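
To make the interplay of the three thresholds concrete, here is a minimal sketch of one possible decision rule. It is not Alligator's actual implementation: the function names are hypothetical, and requiring all three conditions to hold at once is an assumption, since the table does not state how the checks are combined. The defaults are the ones documented above.

```python
import os
from typing import Dict, Mapping, Sequence


def read_thresholds(env: Mapping[str, str] = os.environ) -> Dict[str, float]:
    """Read the three threshold variables, falling back to the documented defaults."""
    return {
        "raw_min_confidence": float(env.get("RAW_MIN_CONFIDENCE", "0.1")),
        "match_threshold": float(env.get("MATCH_THRESHOLD", "0.5")),
        "match_margin_delta": float(env.get("MATCH_MARGIN_DELTA", "0.1")),
    }


def is_auto_match(
    raw_confidence: float,
    normalised_scores: Sequence[float],
    thresholds: Dict[str, float],
) -> bool:
    """Hypothetical rule: accept only if all three checks pass (an assumption)."""
    ranked = sorted(normalised_scores, reverse=True)
    if not ranked:
        return False  # no candidates, nothing to match

    top = ranked[0]
    # With a single candidate, treat the runner-up score as 0.0 (assumption).
    second = ranked[1] if len(ranked) > 1 else 0.0

    return (
        raw_confidence >= thresholds["raw_min_confidence"]
        and top >= thresholds["match_threshold"]
        and (top - second) >= thresholds["match_margin_delta"]
    )
```

Under this reading, a top candidate scoring 0.8 against a runner-up at 0.75 would be rejected with the default settings because the margin is below 0.1, even though 0.8 clears MATCH_THRESHOLD. See Scoring & Thresholds for the behaviour the project actually implements.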