# PHBR

Patient Harmonic-mean Best Rank (PHBR) is a metric that represents how well the HLA genotype of an individual can present a specific mutation.

## Usage

```
phbr.py --mhc-predictions predictions.tsv \
        --homozygous-loci A,C \
        --output-file output.tsv \
        --mhci
```

- *mhc-predictions* - A TSV file with columns for the peptide sequence,
                    allele, sub-sequence, and score for the sub-sequence.
- *homozygous-loci* - A comma-separated list of loci that are homozygous.
- *output-file* - Name of the TSV output file.
- *mhci* - Calculate PHBR for MHC-I.
- *mhcii* - Calculate PHBR for MHC-II.

Given the inputs above, this tool will go through the MHC predictions,
determine the best peptide for each allele, and calculate the harmonic mean.
For class I, a total of 6 alleles are expected, unless homozygous loci (A, B, or C) are listed.
For class II, a total of 8 alleles are expected, unless homozygous loci (DPA, DPB, DQA, DQB, or DRB) are listed.
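The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `phbr.py`: the function names, the `(allele, rank)` input shape, and the allele naming pattern (`HLA-A*02:01`) are assumptions, as is the choice to count the best rank of a homozygous locus twice so that the total still reaches 6. It covers class I only and treats a lower rank as better.

```python
from statistics import harmonic_mean


def best_rank_per_allele(predictions):
    """Return the lowest (best) rank seen for each allele.

    `predictions` is an iterable of (allele, rank) pairs (hypothetical shape).
    """
    best = {}
    for allele, rank in predictions:
        if allele not in best or rank < best[allele]:
            best[allele] = rank
    return best


def locus_of(allele):
    # Assumes names like "HLA-A*02:01": the locus is the text
    # between "HLA-" and "*".
    return allele.split("*")[0].replace("HLA-", "")


def phbr_class_i(predictions, homozygous_loci=(), expected_alleles=6):
    """Harmonic mean of the best rank per allele (class I sketch)."""
    ranks = []
    for allele, rank in best_rank_per_allele(predictions).items():
        # Assumption: an allele at a homozygous locus contributes twice.
        copies = 2 if locus_of(allele) in homozygous_loci else 1
        ranks.extend([rank] * copies)
    if len(ranks) != expected_alleles:
        raise ValueError(
            f"expected {expected_alleles} alleles, got {len(ranks)}"
        )
    return harmonic_mean(ranks)


if __name__ == "__main__":
    example = [
        ("HLA-A*02:01", 1.0), ("HLA-A*02:01", 3.0),
        ("HLA-B*07:02", 2.0),
        ("HLA-C*07:02", 4.0), ("HLA-C*07:02", 8.0),
    ]
    print(phbr_class_i(example, homozygous_loci={"A", "B", "C"}))
```

Because the harmonic mean is dominated by its smallest values, a single well-presented allele pulls the score down sharply.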

### Formatting input from the IEDB T cell class I or II tools

```
mhc2phbr.py --peptide-output peptide-output.tsv \
            --sequence-output sequence-output.tsv \
            --phbr-input mhc-predictions.tsv
```

- *peptide-output* - TSV output from the IEDB TC1 or TC2 tool
- *sequence-output* - TSV sequence output from the IEDB TC1 or TC2 tool
- *phbr-input* - File generated for input to PHBR 

Optional parameters:
- *seqnum-colname* - Name of the peptide table column to pull the sequence number from (default: seq #)
- *peptide-colname* - Name of the peptide table column to pull the sub-peptide from (default: peptide)
- *allele-colname* - Name of the peptide table column to pull the allele name from (default: allele)
- *rank-colname* - Name of the peptide table column to pull the rank from (default: netmhcpan_el percentile)
- *sequence-sequnum-colname* - Name of the sequence table column to pull the sequence number from (default: seq #)
- *sequence-sequence-colname* - Name of the sequence table column to pull the peptide sequence from (default: sequence)
- *sequence-mutation-position-colname* - Name of the sequence table column to pull a comma-separated list of mutation start and end positions from. If this is not specified, --mutation-position can be specified instead.
- *mutation-position* - A single set of comma-separated mutation start and end positions to be used across all sequences. Alternatively, specify --sequence-mutation-position-colname for peptide-specific positions. Use 'all' to indicate the mutation spans the entire sequence.
- *keep-unmutated* - Keep sub-peptides that do not contain the mutation (default: False)

#### Filtering sub-peptides that do not contain the mutation

Three parameters control the filtering of sub-peptides that do not overlap the position of the mutation in the peptide. By default,
the central position of the peptide is assumed to be the mutation position, and any sub-peptides that do not overlap this position
are filtered from the output. To change this behavior, use one or more of the following parameters:
`sequence-mutation-position-colname`, `mutation-position`, `keep-unmutated`. See their descriptions above for more information.

When using 'all' as the mutation position value, the mutation is considered to span the entire sequence, so all sub-peptides will be retained regardless of their position within the sequence.
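The filtering rules above can be illustrated with a short sketch. It is not the code in `mhc2phbr.py`: the function names and the `(start, end)` window representation are hypothetical. It also assumes that positions are 1-based and inclusive, that a single value such as `5` means start and end are the same, and that the "central position" of an even-length peptide is the lower of the two middle residues.

```python
def mutation_span(sequence, mutation_position=None):
    """Return the assumed 1-based inclusive (start, end) of the mutation."""
    if mutation_position is None:
        # Default: the central position of the peptide.
        centre = (len(sequence) + 1) // 2
        return centre, centre
    if mutation_position == "all":
        # The mutation spans the entire sequence.
        return 1, len(sequence)
    parts = [int(p) for p in mutation_position.split(",")]
    # A single value is treated as a one-residue span (assumption).
    return parts[0], parts[-1]


def filter_subpeptides(sequence, windows, mutation_position=None,
                       keep_unmutated=False):
    """Keep the (start, end) windows that overlap the mutation span."""
    if keep_unmutated:
        return list(windows)
    mut_start, mut_end = mutation_span(sequence, mutation_position)
    return [
        (start, end) for start, end in windows
        if start <= mut_end and end >= mut_start
    ]


if __name__ == "__main__":
    seq = "ACDEFGHIKLM"  # 11 residues, so the default position is 6
    windows = [(1, 4), (3, 6), (6, 9), (7, 10)]
    print(filter_subpeptides(seq, windows))
    print(filter_subpeptides(seq, windows, mutation_position="all"))
```

Working with explicit start and end coordinates, rather than searching for each sub-peptide in the sequence, avoids ambiguity when the same sub-peptide occurs more than once.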

#### Examples

To run with default parameters, using the outputs of the IEDB T cell class I predictor:
```
python mhc2phbr.py \
--peptide-output examples/peptide_table.tsv \
--sequence-output examples/sequence_table.tsv \
--phbr-input examples/test-phbr.tsv
```


To use the column named 'mutpos' as the mutation position:
```
python mhc2phbr.py \
--peptide-output examples/peptide_table.tsv \
--sequence-output examples/sequence_table_mutpos.tsv \
--sequence-mutation-position-colname mutpos \
--phbr-input examples/test-phbr.tsv
```


To use a fixed mutation position of 5 for all peptides:
```
python mhc2phbr.py \
--peptide-output examples/peptide_table.tsv \
--sequence-output examples/sequence_table.tsv \
--mutation-position 5 \
--phbr-input examples/test-phbr.tsv
```