python · data engineering

alternative data platform — indian equities

find information that moves stocks in the physical world, before it hits financial statements.

python · alternative data · medallion architecture · point-in-time · feature store · fastapi · prefect · timescaledb · backtesting · quant

what it is

most market signals show up in a company's numbers after the fact. alternative data is the bet that you can see the same thing earlier — in power demand, shipping, footfall, satellite imagery — if you can structure it. this platform is built to do that for indian equities.

it's a scalable skeleton with hard layer contracts plus one fully working vertical slice (posoco power demand). adding the other sources is a plug-in, not a rewrite — copy a template, implement three methods, add a feature recipe, and nothing downstream changes.
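a minimal sketch of what "one abstract contract, three methods" could look like. the method names (`fetch`, `parse`, `validate`), the `Record` shape, and the toy source are all assumptions made for illustration, not the platform's actual interface, and the csv payload is made up rather than real posoco data.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class Record:
    """one normalized observation (hypothetical silver-layer shape)."""
    source: str
    observed_on: date
    metric: str
    value: float


class DataSource(ABC):
    """hypothetical source contract; the three method names are assumptions."""

    name: str

    @abstractmethod
    def fetch(self, day: date) -> bytes:
        """return the raw payload for one day, untouched (bronze)."""

    @abstractmethod
    def parse(self, raw: bytes) -> Iterable[Record]:
        """turn a raw payload into normalized records (silver)."""

    @abstractmethod
    def validate(self, records: Iterable[Record]) -> list[Record]:
        """keep only the records that satisfy the layer contract."""


class ToyPowerDemandSource(DataSource):
    """stand-in source that emits made-up csv, not the real posoco feed."""

    name = "toy_power_demand"

    def fetch(self, day: date) -> bytes:
        d = day.isoformat()
        # second row is deliberately invalid so validate() has work to do
        return f"{d},demand_mw,100.0\n{d},demand_mw,-1.0\n".encode()

    def parse(self, raw: bytes) -> Iterable[Record]:
        for line in raw.decode().splitlines():
            observed_on, metric, value = line.split(",")
            yield Record(self.name, date.fromisoformat(observed_on), metric, float(value))

    def validate(self, records: Iterable[Record]) -> list[Record]:
        return [r for r in records if r.value >= 0]


def ingest(source: DataSource, day: date) -> list[Record]:
    """downstream code depends only on the contract, never on a concrete source."""
    raw = source.fetch(day)
    return source.validate(source.parse(raw))
```

because `ingest` only sees `DataSource`, a new source written against the same three methods would slot in without touching anything after it, which is the property the paragraph above describes.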

the medallion pipeline

sources → ingest — pluggable data sources behind one abstract contract
bronze — raw, untouched, in object storage (minio)
silver — normalized records in postgres
gold — a point-in-time feature store
signals → backtest → delivery — factor models, an indian cost model, fastapi + grafana

the rule that runs through everything

point-in-time correctness is structural, not a checkbox. every feature carries (feature_date, published_date, as_of_date), and every read goes through one function that filters to what was actually knowable at the as-of moment. there is simply no api for seeing the future, and seeing the future (lookahead bias) is the single most common way a backtest lies to you.
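a sketch of the single read path, under stated assumptions: `feature_date` is the day a value describes, `published_date` is when the source released it, and `as_of_date` is taken here to mean when that version entered the store (the exact semantics are not spelled out above). `FeatureRow`, `read_features`, and the sample rows are hypothetical, and the numbers are made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FeatureRow:
    name: str
    feature_date: date    # the day the value describes
    published_date: date  # when the source released it
    as_of_date: date      # assumed: when this version entered the store
    value: float


def read_features(rows, name, as_of):
    """return, per feature_date, the latest version knowable at `as_of`."""
    latest = {}
    for row in rows:
        if row.name != name:
            continue
        # anything published or stored after the as-of moment is invisible
        if row.published_date > as_of or row.as_of_date > as_of:
            continue
        current = latest.get(row.feature_date)
        if current is None or row.as_of_date > current.as_of_date:
            latest[row.feature_date] = row
    return [latest[d] for d in sorted(latest)]


# made-up rows: the jan 1 value is revised on jan 5
ROWS = [
    FeatureRow("demand", date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 2), 100.0),
    FeatureRow("demand", date(2024, 1, 1), date(2024, 1, 5), date(2024, 1, 5), 110.0),
    FeatureRow("demand", date(2024, 1, 2), date(2024, 1, 3), date(2024, 1, 3), 120.0),
]

# a backtest standing on jan 3 sees the original jan 1 value, not the revision
visible_on_jan_3 = [r.value for r in read_features(ROWS, "demand", date(2024, 1, 3))]
```

the design choice worth noting is that the filter lives in the only read function, so a caller cannot forget to apply it; asking for a later `as_of` is the only way to see the revised value.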

language: python
working vertical slice: posoco
storage: minio + pg
delivery: fastapi

i wanted to build the boring, correct part first (the part that stops you fooling yourself) and only then add sources. the anti-lookahead feature store is the whole point.

built by dharun ashokkumar