Methodology

Point-in-Time Accuracy

Financial backtests fail when data appears available earlier than it actually was: the simulation uses information that wasn't publicly known at the time. Valuein timestamps every fact with accepted_at, the exact moment the SEC accepted the filing. Filter by it and your backtest sees only what the market could have seen.

report_date: Fiscal period end

filing_date: Date filed with SEC

accepted_at: SEC acceptance timestamp

Why Most Databases Introduce Look-Ahead Bias

Most financial databases store data as it exists today, not as it was known historically. A fiscal year 2021 annual report filed in March 2022 is often backdated to December 31, 2021 — making it appear as if that data was available before it was. Backtests built on this data are invalid.

Timeline for a 2022 10-K Filing

Dec 31, 2022 (report_date): Fiscal period ends. Results for the full year are computed internally. The public knows nothing yet.

Feb 15, 2023 (filing_date): Company submits the 10-K to SEC EDGAR. The filing is not yet fully indexed; processing can take hours.

Feb 15, 2023 17:42 UTC (accepted_at): SEC accepts and timestamps the filing. This is the earliest moment any investor could have seen this data. Filter by this.

The hidden trap

If a data vendor stores this filing against fiscal_year = 2022 with no timestamp, a backtest that says "use 2022 annual data as of Jan 1, 2023" will include it — but the filing wasn't accepted until Feb 15, 2023. Your simulated portfolio used information that didn't exist yet. This introduces look-ahead bias and inflates backtest performance.
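To make the trap concrete, here is a minimal sketch in plain Python. The two filings, their symbol, and their revenue figures are invented for illustration; only the field names and the Feb 15, 2023 17:42 UTC timestamp come from the text above.

```python
from datetime import datetime, timezone

# Toy filings; symbol and revenue values are invented for illustration only.
filings = [
    {"symbol": "AAA", "fiscal_year": 2021, "revenue": 90.0,
     "accepted_at": datetime(2022, 2, 10, 21, 5, tzinfo=timezone.utc)},
    {"symbol": "AAA", "fiscal_year": 2022, "revenue": 120.0,
     "accepted_at": datetime(2023, 2, 15, 17, 42, tzinfo=timezone.utc)},
]

# The backtest asks: what was known on Jan 1, 2023?
as_of = datetime(2023, 1, 1, tzinfo=timezone.utc)

# Biased: selects by fiscal year, ignoring when the filing became public.
biased = [f for f in filings if f["fiscal_year"] <= 2022]

# PIT-safe: keeps only filings accepted on or before the as-of moment.
pit_safe = [f for f in filings if f["accepted_at"] <= as_of]

# Rows the biased selection used before they existed publicly.
leaked = [f for f in biased if f["accepted_at"] > as_of]

print([f["fiscal_year"] for f in pit_safe])
print([f["fiscal_year"] for f in leaked])
```

The fiscal-year selection pulls in the 2022 filing even though it was accepted six weeks after the as-of date; the accepted_at selection does not.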

The Three Key Fields

accepted_at (table: fact, type: TIMESTAMPTZ)

The exact UTC timestamp the SEC accepted the filing that disclosed this fact. Each fact row inherits accepted_at from its parent filing on indexing, so the filter works identically on either table. This is your PIT anchor — use it exclusively for backtest-safe queries. It represents the earliest moment any investor could have read this data.

Always filter: WHERE accepted_at <= your_date

filing_date (table: filing, type: DATE)

The date the SEC received the filing. Very close to accepted_at but lacks the exact time component. Suitable for rough date-range filtering but accepted_at is more precise for PIT analysis.

Safe for range filtering, less precise than accepted_at

report_date (table: filing, type: DATE)

The fiscal period end date (e.g. December 31 for a calendar-year company). This is NOT a PIT field — using it as a filter introduces look-ahead bias because the data wasn't known until the filing date weeks or months later.

For display purposes only — never use as a PIT filter
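The difference between day precision and timestamp precision matters most for decisions made on the filing day itself. The sketch below reuses the dates from the timeline above; the 14:30 UTC decision time is a hypothetical example, not something taken from Valuein's data.

```python
from datetime import date, datetime, timezone

# Values from the timeline above.
filing_date = date(2023, 2, 15)
accepted_at = datetime(2023, 2, 15, 17, 42, tzinfo=timezone.utc)

# Hypothetical intraday decision made at 14:30 UTC on the filing day.
decision_time = datetime(2023, 2, 15, 14, 30, tzinfo=timezone.utc)

# Day-precision check: the filing looks available for the whole day.
visible_by_date = filing_date <= decision_time.date()

# Timestamp check: the filing had not been accepted yet at decision time.
visible_by_timestamp = accepted_at <= decision_time

print(visible_by_date, visible_by_timestamp)
```

Both datetimes are timezone-aware on purpose: Python raises a TypeError when an aware datetime is compared with a naive one, which is a useful safeguard against silently mixing time zones.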

Wrong vs. Right Queries

The difference between a biased and a valid backtest often comes down to a single WHERE clause.

Wrong — look-ahead bias

-- WRONG: look-ahead bias introduced
-- This returns data as if you knew it on Jan 1 2022,
-- but 10-K filings for fiscal year 2021 weren't published
-- until Feb–March 2022. You're using future information.
SELECT
  r.symbol,
  fa.numeric_value / 1e9 AS revenue_billions
FROM references r
JOIN filing f  ON f.entity_id = r.cik
JOIN fact fa   ON fa.accession_id = f.accession_id
WHERE fa.standard_concept = 'Revenues'
  AND f.fiscal_year = 2021          -- WRONG: fiscal year is NOT when data was known
  AND f.form_type   = '10-K'
ORDER BY revenue_billions DESC;

Right — PIT-safe

-- RIGHT: point-in-time safe using accepted_at
-- Only returns data that was publicly available on 2022-01-01.
-- If a company filed its 2020 10-K late (e.g. Feb 2022),
-- it will NOT appear in this query -- correct behavior.
SELECT
  r.symbol,
  fa.numeric_value / 1e9 AS revenue_billions,
  fa.accepted_at                            -- visible timestamp
FROM references r
JOIN filing f  ON f.entity_id = r.cik
JOIN fact fa   ON fa.accession_id = f.accession_id
WHERE fa.standard_concept = 'Revenues'
  AND f.form_type         = '10-K'
  AND fa.accepted_at     <= '2022-01-01'   -- RIGHT: PIT filter
ORDER BY revenue_billions DESC;
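One thing to note about the PIT-safe query: with no fiscal-year filter, it returns every 10-K revenue fact accepted by the cutoff, so a company with several years of filings appears more than once. A backtest usually wants only the most recent value known per company. The following is a minimal sketch of that selection in plain Python; latest_known is a hypothetical helper and the rows are invented, so treat it as an illustration of the logic rather than part of the Valuein SDK.

```python
from datetime import datetime, timezone

def latest_known(facts, as_of):
    """Return, per symbol, the most recently accepted fact visible at as_of."""
    latest = {}
    for fact in facts:
        if fact["accepted_at"] > as_of:
            continue  # not yet public at the as-of moment
        current = latest.get(fact["symbol"])
        if current is None or fact["accepted_at"] > current["accepted_at"]:
            latest[fact["symbol"]] = fact
    return latest

utc = timezone.utc
# Invented rows for illustration only.
facts = [
    {"symbol": "AAA", "fiscal_year": 2020, "revenue": 80.0,
     "accepted_at": datetime(2021, 2, 12, 22, 0, tzinfo=utc)},
    {"symbol": "AAA", "fiscal_year": 2021, "revenue": 90.0,
     "accepted_at": datetime(2022, 2, 11, 22, 0, tzinfo=utc)},
    {"symbol": "BBB", "fiscal_year": 2019, "revenue": 40.0,
     "accepted_at": datetime(2020, 3, 2, 21, 0, tzinfo=utc)},
    {"symbol": "BBB", "fiscal_year": 2020, "revenue": 45.0,
     "accepted_at": datetime(2021, 3, 1, 21, 0, tzinfo=utc)},
]

as_of = datetime(2022, 1, 1, tzinfo=utc)
result = latest_known(facts, as_of)
for symbol, fact in sorted(result.items()):
    print(symbol, fact["fiscal_year"], fact["revenue"])
```

On 2022-01-01 the newest visible annual figure for both toy companies is fiscal year 2020, because the fiscal year 2021 filing was not accepted until February 2022.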

Survivorship Bias

Look-ahead bias is temporal — using today's data in the past. Survivorship bias is structural — only analyzing companies that still exist today. Both inflate backtest returns and both are invisible unless your dataset is specifically built to prevent them.

Delisted companies

Valuein tracks all entities including those that were delisted, acquired, or went bankrupt. The Pro and Institutional tiers include ~18,000 entities — active and inactive.

Historical index membership

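The index_membership table records the exact effective_date and removal_date for each company in each index, with half-open [effective_date, removal_date) interval semantics: a company counts as a member from its effective_date up to, but not including, its removal_date. A 2010 S&P 500 backtest uses the 2010 constituents, not today's.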

PIT universe construction

Use get_pit_universe(as_of_date) to reconstruct the exact investable universe on any historical date — free of additions that happened after.

Survivorship-bias-free universe construction (SQL)

-- Build a survivorship-bias-free universe for March 2020
-- This returns exactly who was in the S&P500 on that date --
-- before COVID additions/removals, before failures, before mergers.
SELECT
  cik,
  ticker,
  name,
  sector
FROM get_pit_universe(
  as_of_date => '2020-03-01',
  index       => 'SP500'
);

-- WRONG alternative (survivorship bias):
-- Using the current S&P500 list for 2020 data excludes
-- companies that were dropped and includes companies that
-- didn't exist in the index yet.
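The half-open [) interval rule on index_membership decides what happens on the boundary days. Here is a minimal sketch of that rule in plain Python. The in_index helper, the membership rows, and the use of None for "still a member" are all assumptions made for illustration; the actual table may encode open-ended membership differently.

```python
from datetime import date

def in_index(as_of, effective_date, removal_date):
    """Half-open [effective_date, removal_date): start included, end excluded."""
    if as_of < effective_date:
        return False
    # Assumption: a removal_date of None means the company is still a member.
    return removal_date is None or as_of < removal_date

# Hypothetical membership rows: (ticker, effective_date, removal_date).
memberships = [
    ("AAA", date(2005, 1, 3), date(2020, 3, 1)),  # removed on the as-of date
    ("BBB", date(2020, 3, 1), None),              # added on the as-of date
    ("CCC", date(2021, 6, 1), None),              # added later
]

as_of = date(2020, 3, 1)
universe = [t for t, start, end in memberships if in_index(as_of, start, end)]
print(universe)
```

With half-open intervals a company removed on a given day and its replacement added the same day never overlap, so the constituent count stays consistent across the changeover.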

PIT in the Python SDK

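Every SDK method that returns time-series data accepts an as_of_date parameter. Pass it to transparently filter by accepted_at. When you write SQL yourself through run_query, as in the example below, apply the accepted_at filter explicitly.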

from valuein_sdk import ValueinClient, ValueinError

try:
    with ValueinClient() as client:

        # PIT-safe: only data known as of the backtest date
        df = client.run_query("""
            SELECT r.symbol, fa.fiscal_year,
                   fa.numeric_value / 1e9 AS revenue_bn,
                   fa.accepted_at
            FROM fact fa
            JOIN references r ON fa.entity_id = r.cik
            WHERE r.symbol              = 'AAPL'
              AND fa.standard_concept   = 'TotalRevenue'
              AND fa.fiscal_period      = 'FY'
              AND fa.accepted_at      <= '2023-01-01'   -- PIT filter
            ORDER BY fa.fiscal_year DESC
            LIMIT 10
        """)

        # All rows have accepted_at <= 2023-01-01
        print(df[["fiscal_year", "revenue_bn", "accepted_at"]])

except ValueinError as e:
    print(f"Error: {e}")
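Because a hand-written query can silently lose its accepted_at filter during refactoring, it can help to verify the result before it reaches the backtest. The guard below is a minimal sketch: assert_no_lookahead is a hypothetical helper, not part of the Valuein SDK, and it assumes the timestamps are timezone-aware datetime objects.

```python
from datetime import datetime, timezone

def assert_no_lookahead(accepted_times, as_of):
    """Raise ValueError if any timestamp is later than the as-of moment."""
    late = [t for t in accepted_times if t > as_of]
    if late:
        raise ValueError(
            f"{len(late)} row(s) accepted after {as_of.isoformat()}"
        )

as_of = datetime(2023, 1, 1, tzinfo=timezone.utc)

# Passes silently: every row was public by the as-of moment.
assert_no_lookahead(
    [datetime(2022, 10, 28, 22, 3, tzinfo=timezone.utc)], as_of
)

# Fails loudly: this row was accepted after the as-of moment.
try:
    assert_no_lookahead(
        [datetime(2023, 2, 15, 17, 42, tzinfo=timezone.utc)], as_of
    )
except ValueError as exc:
    print(f"Leak detected: {exc}")
```

If the query result is a pandas DataFrame, the accepted_at column could be passed in directly, assuming it holds timezone-aware timestamps.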

Frequently Asked Questions

Why is accepted_at sometimes later than filing_date?

The SEC processes filings asynchronously. A filing submitted on February 14 may not receive its EDGAR acceptance timestamp until late that evening or early the next day. accepted_at captures the exact millisecond of acceptance — always later than or equal to the submission time.

Can I trust filing_date for PIT backtests?

It's usable for rough filtering but accepted_at is strictly more accurate. Some data providers conflate the two. In Valuein's schema, filing_date is a DATE (day precision) and accepted_at is a TIMESTAMPTZ (millisecond precision). For production backtests, always use accepted_at.

How do I handle the fact that Q2 and Q3 10-Q cash flow figures are year-to-date?

Use COALESCE(derived_quarterly_value, numeric_value) on cash flow concepts. The pipeline computes derived_quarterly_value for Q2 and Q3 by subtracting the prior period YTD. This makes all quarters directly comparable without manual adjustments.
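The subtraction and the COALESCE fallback can be sketched in a few lines of Python. The cash flow figures are invented, and the coalesce helper simply mimics the SQL function; this illustrates the arithmetic described above, not the pipeline's actual implementation.

```python
def coalesce(*values):
    """Return the first value that is not None, like SQL COALESCE."""
    return next((v for v in values if v is not None), None)

# Hypothetical year-to-date operating cash flow reported in each 10-Q.
ytd = {"Q1": 10.0, "Q2": 25.0, "Q3": 45.0}

# Derived single-quarter values: current YTD minus the prior period's YTD.
# Q1 needs no adjustment, so its derived value is left empty here.
derived = {
    "Q1": None,
    "Q2": ytd["Q2"] - ytd["Q1"],
    "Q3": ytd["Q3"] - ytd["Q2"],
}

# Equivalent of COALESCE(derived_quarterly_value, numeric_value) per quarter.
quarterly = {q: coalesce(derived[q], ytd[q]) for q in ytd}
print(quarterly)
```

Q1 falls back to the reported figure, while Q2 and Q3 use the derived values, so the three quarters come out as 10.0, 15.0, and 20.0 and can be compared directly.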

Does the sample tier support PIT queries?

Yes. The sample tier includes accepted_at on all fact rows. You can build and validate PIT query patterns on free sample data before upgrading to sp500 or full.

Ready to build a PIT-safe backtest?

Start with the free sample tier — all PIT fields are included. Upgrade for full S&P500 history or the complete 16,000+ ticker universe.
