How the FIFO capital gains calculation works
This page explains the concepts behind the FIFO capital-gains engine
implemented in irs_asset_fifo_calculator.
It is meant as a human-readable overview of what the code is doing, without
diving into implementation details.
Overview
The goal of this project is to compute IRS Form 8949-style rows for trades in stocks or other assets, using a First In, First Out (FIFO) cost-basis method.
You start with a CSV of raw transactions (asset_tx.csv). The library:
Groups low-level CSV rows into logical transaction blocks using Tx Index and Type (Buy, Sell, Exchange, Transfer).
For each block, extracts three conceptual pieces:
    Buy side (buy_data)
    Sell side (sell_data)
    Fee side (fee_data)
Maintains a FIFO queue of lots for each asset.
On sales (and some fees), it consumes those lots in order and writes Form 8949 rows describing each realized gain or loss.
The entry point for this logic is:
run_fifo_pipeline(df: pd.DataFrame) -> list[dict[str, str]] (pure data-in / data-out)
main(...) in the CLI wrapper just handles reading/writing CSV and calls run_fifo_pipeline.
FIFO logic
Buy, Sell, or Exchange
Every time a purchase, sale, or exchange is made, the holdings of one asset typically increase and the holdings of another decrease. These changes must be documented so that the gain or loss (proceeds minus cost basis) can be calculated for tax purposes. Purchases are added to the FIFO ledger, and sales are deducted from the ledger starting with the oldest lots. The only exception is USD, which is not tracked because it does not generate gains or losses.
Fees may be incurred for these transactions and can be deducted. Three cases may occur:
If the fee asset is the same as the bought asset, then the fee amount is directly deducted from the amount of the bought asset that is added to the FIFO ledger. This deduction also affects the cost basis of the bought asset, increasing it.
If the fee asset is the same as the sold asset, then it is directly added to the amount of the sold asset that is used to update the FIFO ledger. This deduction also affects the proceeds of the sold asset, decreasing it.
If the fee asset is different from both the buy and sell assets, then it is treated as a sale, and the FIFO ledger is updated accordingly, reducing the oldest lot for that asset.
Fees for each transaction can be deducted in the buy asset, the sell asset, and in one more asset. Two fee assets that are neither the buy nor sell asset will result in an error.
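The three fee cases can be sketched as a small classifier. This is an illustration only, not the library's code: FeeTreatment, classify_fee, and check_single_third_asset are hypothetical names introduced here, and the sketch assumes assets are compared by their plain ticker (without the "fee" prefix).

```python
from enum import Enum
from typing import Iterable, Optional


class FeeTreatment(Enum):
    """Hypothetical labels for the three fee cases described above."""
    REDUCE_BOUGHT_AMOUNT = "fee deducted from the bought amount"
    INCREASE_SOLD_AMOUNT = "fee added to the sold amount"
    SEPARATE_SALE = "fee asset sold from its own FIFO lots"


def classify_fee(fee_asset: str,
                 buy_asset: Optional[str],
                 sell_asset: Optional[str]) -> FeeTreatment:
    """Pick the fee case by comparing the fee asset with the two legs."""
    if fee_asset == buy_asset:
        return FeeTreatment.REDUCE_BOUGHT_AMOUNT
    if fee_asset == sell_asset:
        return FeeTreatment.INCREASE_SOLD_AMOUNT
    # Neither leg: treated as a sale of the fee asset (oldest lot first).
    return FeeTreatment.SEPARATE_SALE


def check_single_third_asset(fee_assets: Iterable[str],
                             buy_asset: Optional[str],
                             sell_asset: Optional[str]) -> None:
    """Reject blocks with two or more fee assets outside the buy/sell legs."""
    third = {a for a in fee_assets if a not in (buy_asset, sell_asset)}
    if len(third) > 1:
        raise ValueError(f"More than one third-asset fee: {sorted(third)}")


# Exchange of NVDA for TSLA with a fee paid in each leg.
case_buy = classify_fee("TSLA", buy_asset="TSLA", sell_asset="NVDA")
case_sell = classify_fee("NVDA", buy_asset="TSLA", sell_asset="NVDA")
```

Keeping the classification separate from the ledger update makes it easy to test each fee case on its own.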
Transfer
For transfers, the fees are not deducted, but are taken into account when updating the FIFO ledger, where they are subtracted from the oldest lot for that asset.
Lots and the FIFO ledger
For each asset (e.g. NVDA or TSLA), the calculator keeps a FIFO queue of
lots:
FifoLot:
amount: float # units of the asset
price: float # unit price in USD
cost: float # total cost in USD (including fees)
tx_date: date # acquisition date
Conceptually:
A buy creates a new lot and appends it to the right end of the queue.
A sell takes units from the left (oldest lot first) and may consume multiple lots.
After a sale, the remaining amount and cost in each lot are updated.
Example
Two purchases are made, followed by a sale:
Buy 10 NVDA at $100 → one lot
Buy 5 NVDA at $110 → second lot
Sell 12 NVDA at $130
When you sell 12 NVDA:
You take 10 from the first lot (oldest).
You take 2 from the second lot.
The second lot now has 3 units remaining.
The realized gain/loss is computed per used lot, and each piece becomes a row on Form 8949.
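The example can be worked through in code. The sketch below is a simplified stand-in for the library's lot handling, not its implementation: Lot and sell_fifo are hypothetical names, acquisition dates are left out, and proceeds are assumed to be split across lots in proportion to the units taken from each.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Lot:
    amount: float  # units still held in this lot
    cost: float    # total USD cost of the units still held


def sell_fifo(lots, amount, proceeds):
    """Consume `amount` units oldest-first.

    Returns one (units, cost_basis, proceeds) tuple per lot touched.
    """
    slices = []
    remaining = amount
    unit_proceeds = proceeds / amount
    while remaining > 1e-12:
        if not lots:
            raise ValueError("not enough units in the FIFO queue")
        lot = lots[0]                              # oldest lot sits on the left
        used = min(lot.amount, remaining)
        cost_used = lot.cost * used / lot.amount   # proportional cost basis
        slices.append((used, cost_used, used * unit_proceeds))
        lot.amount -= used
        lot.cost -= cost_used
        if lot.amount <= 1e-12:
            lots.popleft()                         # lot fully consumed
        remaining -= used
    return slices


# Buy 10 NVDA at $100, buy 5 NVDA at $110, then sell 12 NVDA at $130.
lots = deque([Lot(10.0, 10 * 100.0), Lot(5.0, 5 * 110.0)])
slices = sell_fifo(lots, amount=12.0, proceeds=12 * 130.0)
gains = [gross - cost for _, cost, gross in slices]
```

Under these assumptions the sale yields two slices: 10 units with a cost basis of $1,000 and proceeds of $1,300, and 2 units with a cost basis of $220 and proceeds of $260. The second lot keeps 3 units with $330 of remaining cost.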
Transaction blocks (Tx Index and Type)
Each logical transaction in the CSV is represented by one or more rows sharing the same:
Tx Index – integer ID of the block (e.g. 0, 1, 2, …)
Type – transaction type for the entire block:
    "Buy"
    "Sell"
    "Exchange"
    "Transfer"
The pipeline groups the input DataFrame by Tx Index and expects all rows in
a group to have the same Type. If a block contains mixed types, it is
rejected with an error.
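The grouping and the mixed-type check can be sketched without any dependencies. The library itself works on a pandas DataFrame; this illustration uses plain dictionaries instead, and group_blocks is a hypothetical helper name.

```python
from collections import defaultdict


def group_blocks(rows):
    """Group rows by "Tx Index" and require a single "Type" per block."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[row["Tx Index"]].append(row)
    for tx_index, block in blocks.items():
        types = {row["Type"] for row in block}
        if len(types) != 1:
            raise ValueError(
                f"Tx Index {tx_index} has mixed types: {sorted(types)}"
            )
    return dict(blocks)


rows = [
    {"Tx Index": 0, "Asset": "USD", "Type": "Buy"},
    {"Tx Index": 0, "Asset": "NVDA", "Type": "Buy"},
    {"Tx Index": 0, "Asset": "feeUSD", "Type": "Buy"},
    {"Tx Index": 1, "Asset": "NVDA", "Type": "Sell"},
    {"Tx Index": 1, "Asset": "USD", "Type": "Sell"},
]
blocks = group_blocks(rows)
```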
Within a block you can have:
Non-fee rows – the actual economic legs, e.g.:
Spending USD, receiving NVDA
Selling NVDA, receiving USD
Swapping NVDA for TSLA
Fee rows – rows where Asset starts with "fee" (e.g. feeUSD, feeNVDA, feeTSLA). These represent transaction fees paid in some asset.
A typical Buy block might look like this:
| Tx Index | Date | Asset | Amount (asset) | Sell price ($) | Buy price ($) | Type |
|---|---|---|---|---|---|---|
| 0 | 2024-09-04 | USD | -1250.0 | 1.0 | 1.0 | Buy |
| 0 | 2024-09-04 | NVDA | 10.0 | NaN | 125.0 | Buy |
| 0 | 2024-09-04 | feeUSD | -10.0 | 1.0 | NaN | Buy |
This is interpreted as:
You spend USD (including a USD fee),
You receive NVDA,
Fees are folded into the cost basis of the NVDA lot according to the rules in the parsing logic.
The function run_fifo_pipeline processes each transaction block in turn,
extracts the buy, sell, and fee sides, and then updates the per-asset
FIFO ledger accordingly.
Parsing buy, sell, and fee data
For each transaction block, the function
parse_row_data(block_type, rows)
returns three AssetData structures:
AssetData:
asset: str | None
amount: float
price: float
total: float # cost or proceeds
tx_date: date
Buy side
For a "Buy" or "Exchange" block, the buy side identifies the row where "Amount (asset)" is positive (after subtracting fee rows in the same asset).
total is the total cost including fees (when fees should be folded into the buy).
Sell side
For a "Sell" or "Exchange" block, the sell side identifies the row where "Amount (asset)" is negative (after subtracting fee rows).
total is the total proceeds after any adjustments from fees.
Fee side
Any rows whose "Asset" starts with "fee" are treated as fee rows.
Depending on the asset and type of transaction, fees may:
Be added to the cost of the bought asset,
Be subtracted from proceeds of the sold asset, or
Be treated as a separate sale (e.g. for transfers, or when fees are paid in an asset that is not directly the buy or sell asset).
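As a rough illustration of how the sides of a block can be identified, the sketch below nets fee rows against the asset they are paid in and then reads the sign of each net amount. It is not parse_row_data itself: base_asset and net_amounts are hypothetical helpers, and the input is a list of plain dictionaries holding only the two columns needed here.

```python
def base_asset(name):
    """Strip the "fee" prefix, so "feeUSD" maps to "USD"."""
    return name[3:] if name.startswith("fee") else name


def net_amounts(rows):
    """Sum signed amounts per asset, merging fee rows into their asset."""
    totals = {}
    for row in rows:
        asset = base_asset(row["Asset"])
        totals[asset] = totals.get(asset, 0.0) + row["Amount (asset)"]
    return totals


# The Buy block from the table earlier on this page.
block = [
    {"Asset": "USD", "Amount (asset)": -1250.0},
    {"Asset": "NVDA", "Amount (asset)": 10.0},
    {"Asset": "feeUSD", "Amount (asset)": -10.0},
]
net = net_amounts(block)
buy_side = [asset for asset, amount in net.items() if amount > 0]
sell_side = [asset for asset, amount in net.items() if amount < 0]
```

Here the net USD outflow is 1260.0 (the 1250.0 spent plus the 10.0 fee), NVDA is the only asset with a positive net amount, and USD is the only one with a negative net amount.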
Updating the FIFO ledger
Once buy_data, sell_data, and fee_data have been extracted for a block,
they are applied to the per-asset FIFO queues via update_fifo(...).
At a high level:
Buy updates
If buy_data.asset is not None and not "USD":
    If buy_data.amount > 0 → append a new lot to that asset’s FIFO queue:
        fifo[buy_data.asset].append({
            "amount": buy_data.amount,
            "price": buy_data.price,
            "cost": buy_data.total,
            "tx_date": buy_data.tx_date,
        })
    If buy_data.amount < 0 → this means previously calculated fees (in the same asset) exceed the nominal buy amount, so the net effect is a sale of that asset. In that case, the code calls reduce_fifo(...) to consume existing lots and emit Form 8949 rows:
        reduce_fifo(
            form8949,
            abs(buy_data.amount),
            buy_data.asset,
            fifo[buy_data.asset],
            buy_data.total,
            buy_data.tx_date,
        )
Sell updates
If sell_data.asset is not None, not "USD", and sell_data.amount < 0:
    This represents a real sale of a non-USD asset. The code again calls reduce_fifo(...) to:
        Consume lots from the left (oldest first),
        Compute proportional cost and proceeds,
        Write one or more Form 8949 rows via record_sale(...).
    The call looks like this:
        reduce_fifo(
            form8949,
            abs(sell_data.amount),
            sell_data.asset,
            fifo[sell_data.asset],
            sell_data.total,
            sell_data.tx_date,
        )
Fee updates
If fee_data.asset is the same as buy_data.asset or sell_data.asset, those fees must already have been folded into buy_data or sell_data upstream. In that case, update_fifo raises an error to avoid double-counting:
    if fee_data.asset == buy_data.asset or fee_data.asset == sell_data.asset:
        raise ValueError(
            f"Fee asset {fee_data.asset} should already be taken "
            f"into account in buy {buy_data.asset} or sell "
            f"{sell_data.asset} asset."
        )
Otherwise, if fee_data.asset is a non-USD asset and fee_data.amount != 0.0, paying the fee in that asset is treated as a taxable event, as if the fee asset were sold for dollars that are then used to pay the fee. The fee amount is sold out of the FIFO lots for that asset:
    if fee_data.asset is not None and fee_data.asset != "USD" and fee_data.amount != 0.0:
        reduce_fifo(
            form8949,
            abs(fee_data.amount),
            fee_data.asset,
            fifo[fee_data.asset],
            fee_data.total,
            fee_data.tx_date,
        )
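The buy, sell, and fee rules in this section can be condensed into one decision function that only reports which ledger actions would follow, without touching any lots. This is a sketch under assumptions: Side and plan_ledger_actions are hypothetical names, the sides carry only an asset and a signed amount, and the duplicate-fee check is skipped when the fee asset is None.

```python
from typing import NamedTuple, Optional


class Side(NamedTuple):
    asset: Optional[str]
    amount: float


def plan_ledger_actions(buy, sell, fee):
    """Return ("append" | "reduce", asset, units) tuples for one block."""
    actions = []
    # Buy side: positive amounts open a lot, negative ones act as a sale.
    if buy.asset is not None and buy.asset != "USD":
        if buy.amount > 0:
            actions.append(("append", buy.asset, buy.amount))
        elif buy.amount < 0:
            actions.append(("reduce", buy.asset, abs(buy.amount)))
    # Sell side: only non-USD disposals consume lots.
    if sell.asset is not None and sell.asset != "USD" and sell.amount < 0:
        actions.append(("reduce", sell.asset, abs(sell.amount)))
    # Fee side: same-asset fees should have been folded in upstream.
    if fee.asset is not None and fee.asset in (buy.asset, sell.asset):
        raise ValueError(f"Fee asset {fee.asset} should already be folded in")
    if fee.asset is not None and fee.asset != "USD" and fee.amount != 0.0:
        actions.append(("reduce", fee.asset, abs(fee.amount)))
    return actions


# Exchange NVDA for TSLA, paying the fee in a third asset ("AMD" is made up).
actions = plan_ledger_actions(
    buy=Side("TSLA", 5.0),
    sell=Side("NVDA", -10.0),
    fee=Side("AMD", -0.1),
)
```

Separating the decision from the mutation in this way keeps the branching easy to unit-test; the real update_fifo applies the changes directly.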
From FIFO movements to Form 8949 rows
Every time a lot (or part of a lot) is used in a sale, the code calls
record_sale(...) to create a Form 8949-style row.
For each portion of a lot that is sold, record_sale receives:
asset: Ticker or symbol (e.g. "NVDA").
amount: Quantity of the asset sold from that lot.
proceeds: Dollar proceeds allocated to this portion of the sale.
cost_basis: Dollar cost basis allocated from the FIFO lot.
acquisition_date: Original lot acquisition date.
sale_date: Date of the sale.
Using that information, it appends a dictionary like this to form8949:
{
"Description": "10.00000000 NVDA",
"Date Acquired": "01/01/2024",
"Date Sold": "09/04/2024",
"Proceeds": "10000.00",
"Cost Basis": "1000.00",
"Gain or Loss": "9000.00",
}
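A row like the one above could be assembled as follows. This is a sketch rather than the actual record_sale: format_money and make_row are hypothetical names, the string formats are inferred from the example row, and the gain is assumed to be computed from the unrounded proceeds and cost basis before formatting. Losses are wrapped in parentheses, as described in this section.

```python
from datetime import date


def format_money(value):
    """Round to cents; wrap negative values in parentheses."""
    rounded = round(value, 2)
    text = f"{abs(rounded):.2f}"
    return f"({text})" if rounded < 0 else text


def make_row(asset, amount, proceeds, cost_basis, acquisition_date, sale_date):
    """Build one Form 8949-style row as a dict of strings."""
    return {
        "Description": f"{amount:.8f} {asset}",
        "Date Acquired": acquisition_date.strftime("%m/%d/%Y"),
        "Date Sold": sale_date.strftime("%m/%d/%Y"),
        "Proceeds": format_money(proceeds),
        "Cost Basis": format_money(cost_basis),
        "Gain or Loss": format_money(proceeds - cost_basis),
    }


row = make_row("NVDA", 10.0, 10000.0, 1000.0, date(2024, 1, 1), date(2024, 9, 4))
loss = format_money(-50.2)
```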
Key points:
One sale can produce multiple rows if it consumes more than one FIFO lot.
"Proceeds" and "Cost Basis" are rounded to cents and stored as strings, matching Form 8949 formatting.
"Gain or Loss" is computed as "Proceeds" - "Cost Basis":
    Positive values are written as a plain number (e.g. "9000.00"),
    Negative values are wrapped in parentheses (e.g. "(50.20)"), following IRS conventions.
All such dictionaries collected in form8949 are:
Returned by run_fifo_pipeline(df) as list[dict[str, str]], and
Written by main(...) to "form8949.csv" for import into tax software or manual transcription to Form 8949.
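Writing such a list of dictionaries to CSV needs only the standard library. The sketch below is not the project's main(...); write_rows is a hypothetical helper, the column order is assumed from the example row, and an in-memory buffer stands in for form8949.csv so the snippet runs without touching disk.

```python
import csv
import io

# Column order assumed from the example row shown earlier.
FIELDS = [
    "Description",
    "Date Acquired",
    "Date Sold",
    "Proceeds",
    "Cost Basis",
    "Gain or Loss",
]


def write_rows(rows, handle):
    """Write Form 8949-style rows to any text file-like object."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {
        "Description": "10.00000000 NVDA",
        "Date Acquired": "01/01/2024",
        "Date Sold": "09/04/2024",
        "Proceeds": "10000.00",
        "Cost Basis": "1000.00",
        "Gain or Loss": "9000.00",
    }
]
buffer = io.StringIO()
write_rows(rows, buffer)
csv_lines = buffer.getvalue().splitlines()
```

To write a real file instead, pass a handle opened with open("form8949.csv", "w", newline="").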
Putting it all together
Here’s the full flow, end to end:
1. You provide input
   You start with a CSV of raw transactions (for example, asset_tx.csv) with columns like: Date, Tx Index, Asset, Amount (asset), Sell price ($), Buy price ($), Type.
2. The pipeline groups rows into blocks
   run_fifo_pipeline(df) groups rows by:
       Tx Index → which rows belong to the same logical transaction
       Type → "Buy", "Sell", "Exchange", or "Transfer"
3. Each block is parsed into three conceptual pieces
   For every block, parse_row_data(...) extracts:
       buy_data → what you acquired and at what cost
       sell_data → what you disposed of and for how much
       fee_data → any fees and which asset was used to pay them
4. FIFO ledgers are updated per asset
   update_fifo(...) uses those three pieces to mutate the per-asset FIFO queues:
       Buys usually append new lots (amount, price, cost, date).
       Sells consume existing lots from oldest to newest via reduce_fifo(...).
       Fees paid in non-USD assets can also trigger lot reductions (because paying fees in NVDA/TSLA/etc. is itself a taxable event).
5. Each lot reduction generates Form 8949-style rows
   When a sale (or fee paid in-kind) consumes a lot, record_sale(...):
       Computes the proportional cost basis and proceeds for that slice.
       Appends a dictionary with keys like: "Description", "Date Acquired", "Date Sold", "Proceeds", "Cost Basis", "Gain or Loss".
6. The pipeline returns all rows as a list of dicts
   run_fifo_pipeline(df) returns a list of these Form 8949-style rows, which can be:
       Written to CSV,
       Further processed in Python,
       Or inspected in tests.
7. The CLI wrapper handles file IO
   The main(...) function is a thin IO layer:
       Reads the input CSV into a DataFrame.
       Calls run_fifo_pipeline(df) to compute gains and losses.
       Writes the resulting rows to form8949.csv.
In short:
All tax logic lives in pure functions like run_fifo_pipeline, parse_row_data, update_fifo, and reduce_fifo.
All file handling is kept in main(...).