PRS.jl Documentation

Implementation of ldpred, EB, and lassosum in Julia.

PlinkReader

PRS.PlinkReader - Method
PlinkReader(path; markerIndex = false, sampleIndex = false)

Structure to access PLINK .bed, .bim, and .fam files at path.

Arguments

  • markerIndex::Bool: whether to build a marker -> idx lookup index
  • sampleIndex::Bool: whether to build a sample -> idx lookup index

PRS.samples - Method
samples(p::PlinkReader)

Return an array of Sample structs for the samples in the PLINK file.

PRS.sample_index - Method
sample_index(p::PlinkReader, iid::String)

Return the index of sample iid in the PlinkReader. Uses the sample index if one is available; otherwise falls back to a linear search.
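The lookup strategy described here, a hash index when one exists and a linear scan otherwise, can be sketched in plain Julia. The names build_index and find_id are hypothetical and are not part of PRS.jl; the block only illustrates the trade-off and is not the package's code.

```julia
# Hypothetical helpers illustrating "index if available, otherwise linear search".
# Neither function is part of PRS.jl.

# Build an id -> position lookup table once (O(n) time and memory).
build_index(ids::Vector{String}) = Dict(id => i for (i, id) in enumerate(ids))

# Return the position of `id`, or `nothing` if it is absent.
function find_id(ids::Vector{String}, id::String, index = nothing)
    if index !== nothing
        return get(index, id, nothing)   # expected O(1) with the index
    end
    return findfirst(==(id), ids)        # O(n) linear search without it
end

ids = ["sample1", "sample2", "sample3"]
idx = build_index(ids)
println(find_id(ids, "sample2"))        # linear search
println(find_id(ids, "sample2", idx))   # indexed lookup
```

Building the index costs one pass over all ids, so it presumably pays off only when many lookups are made against the same file.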

PRS.markers - Method
markers(p::PlinkReader)

Return an array of Marker structs for the markers in the PLINK file.

PRS.marker_index - Method
marker_index(p::PlinkReader, id::String)

Return the index of marker id in the PlinkReader. Uses the marker index if one is available; otherwise falls back to a linear search.

PRS.markersDF - Method
markersDF(p::PlinkReader)

Get data frame with marker info.

Returns data frame with columns

  • Chrom
  • Name
  • cM
  • Pos
  • A1
  • A2
  • Idx

Base.getindex - Method
getindex(p::PlinkReader, s::Int, m::Int)

Retrieve genotype of sample s and marker m.

The PLINK convention for representing genotypes is as follows:

  • 0b00 Hom1
  • 0b01 Het
  • 0b10 missing
  • 0b11 Hom2
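As a small illustration, the code table above can be written as a lookup function. The names genotype_label and genotype_dosage are hypothetical and not part of PRS.jl, and the dosage mapping (Hom1 -> 0, Het -> 1, Hom2 -> 2) is an assumption about which allele is counted.

```julia
# Label a 2-bit genotype code, following the table given in this entry.
function genotype_label(code::Integer)
    code == 0b00 && return :Hom1
    code == 0b01 && return :Het
    code == 0b10 && return :Missing
    code == 0b11 && return :Hom2
    throw(ArgumentError("expected a 2-bit genotype code, got $code"))
end

# ASSUMPTION: dosage counts the second allele, so Hom1 -> 0, Het -> 1, Hom2 -> 2.
# Missing genotypes map to `missing`.
function genotype_dosage(code::Integer)
    label = genotype_label(code)
    label === :Missing && return missing
    return label === :Hom1 ? 0 : label === :Het ? 1 : 2
end

codes = [0b00, 0b01, 0b10, 0b11]
println(genotype_label.(codes))
println(genotype_dosage.(codes))
```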

PRS.dosageMatrix - Function
dosageMatrix(p::PlinkReader, markerIdx, sampleIdx = nothing;
               normalize = true)

Create dosage matrix (samples x markers) for markerIdx and sampleIdx from PlinkReader. Here dosage is expressed as count of alternative allele.

Missing genotypes are replaced with the mean dosage. If normalize = true, markers are normalized to mean μ = 0 and standard deviation σ = 1.
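The imputation and normalization steps can be sketched on a small in-memory matrix. The helper impute_and_normalize is hypothetical and is not the PRS.jl implementation. It assumes the mean is taken per marker over observed genotypes and that σ is the sample standard deviation (n - 1 denominator); both are assumptions.

```julia
using Statistics

# Hypothetical helper, not the PRS.jl implementation.
# `G` is a samples x markers matrix of allele counts that may contain `missing`.
function impute_and_normalize(G::AbstractMatrix; normalize = true)
    X = Matrix{Float64}(undef, size(G))
    for j in axes(G, 2)
        col = G[:, j]
        μ = mean(skipmissing(col))          # mean dosage over observed genotypes
        X[:, j] = coalesce.(col, μ)         # mean imputation
        if normalize
            σ = std(X[:, j])                # ASSUMPTION: sample std (n - 1 denominator)
            X[:, j] = (X[:, j] .- μ) ./ σ   # monomorphic markers (σ = 0) are not handled
        end
    end
    return X
end

G = [0 2; 1 missing; 2 0; missing 1]
X = impute_and_normalize(G)
println(mean(X; dims = 1))
println(std(X; dims = 1))
```

Imputing with the observed mean leaves the column mean unchanged, so subtracting μ afterwards centers each column at zero.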


LDMatrix

PRS.LDMatrix - Method
LDMatrix(p::PlinkReader, mIdx0, mIdx1; alpha = 0.9, window = posWindow(1_000_000))

Compute an LDMatrix from the genotypes in a PLINK file.

PRS.LDMatrix - Method
LDMatrix(path::AbstractString)

Load LDMatrix from .bim and .lds files at path.

PRS.markers - Method
markers(ld::LDMatrix)

Get array of Marker structs for all markers in LDMatrix ld.

PRS.markersDF - Method
markersDF(ld::LDMatrix)

Get data frame with marker info.

Returns data frame with columns

  • Chrom
  • Name
  • cM
  • Pos
  • A1
  • A2
  • Idx

PRS.save - Method
save(ld::LDMatrix, path::AbstractString)

Save LDMatrix to path.

PRS.ldscore - Method
ldscore(ld::LDMatrix)

Compute LD scores for markers in LDMatrix ld.
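For orientation, the textbook LD score of marker j is the sum of squared correlations r²_jk with all markers k in its neighbourhood. The sketch below computes it from a small dense correlation matrix. The function ldscore_sketch is hypothetical, and it is an assumption that the package's ldscore follows this unadjusted definition; the LDMatrix type is presumably windowed rather than dense.

```julia
using Statistics

# ASSUMPTION: LD score of marker j = sum over k of r_jk^2, with no bias adjustment.
# `R` is a dense markers x markers correlation matrix.
ldscore_sketch(R::AbstractMatrix) = vec(sum(abs2, R; dims = 2))

# Toy genotype matrix (samples x markers), made up for illustration.
X = [0.0 0.0 2.0;
     1.0 1.0 0.0;
     2.0 2.0 1.0;
     1.0 0.0 1.0]
R = cor(X)                 # pairwise correlations between columns
println(ldscore_sketch(R))
```

Each score is at least 1 because the diagonal term r²_jj = 1 is included in the sum.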


ldpred

PRS.ldpred_gibbs - Method
ldpred_gibbs(z, D, p, σ2, μ0; n_burnin = 100, n_iter = 500, verbose = false)

Run LDpred Gibbs sampler.

Arguments

  • z::Vector: z-scores
  • D::LDMatrix: LD matrix
  • p::Real : proportion of variants deemed to be causal
  • σ2::Real: $N h^2 / M$ as estimated from LD score regression (estimate_h2, or mean(z^2)/mean(lds)). Note: the prior variance of the non-null component of $μ$ is $σ2_μ = σ2/p$.
  • μ0::Vector: starting estimate (e.g. from infinite model)
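The relation between σ2, p, and the prior variance of the non-null component can be checked with a few lines of arithmetic. All numbers below are made up for illustration and do not come from the package or from real data.

```julia
# Illustrative values only.
N  = 10_000      # GWAS sample size
M  = 100_000     # number of markers
h2 = 0.5         # heritability estimate, e.g. from LD score regression
p  = 0.01        # assumed proportion of causal variants

σ2   = N * h2 / M    # the σ2 argument: N h^2 / M
σ2_μ = σ2 / p        # prior variance of the non-null component of μ

println((σ2, σ2_μ))
```

A smaller p concentrates the same total heritability on fewer variants, so each causal variant gets a larger prior variance.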

PRS.estimate_h2 - Method
estimate_h2(z, lds, n)

Estimate total heritability $h^2$ for markers using LD score regression with intercept forced to 1.

Arguments

  • lds::Vector: LD score as computed by ldscore(LDMatrix)
  • n::Integer: effective number of samples ($4 n_0 n_1/(n_0+n_1)$ for case-control)
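LD score regression models the expected squared z-score of marker j as E[z_j^2] = 1 + (n h^2 / M) l_j, where l_j is the LD score and M the number of markers. With the intercept fixed at 1, the slope can be estimated by least squares through the origin. The sketch below is unweighted, which is an assumption; estimate_h2_sketch is a hypothetical name and the package may weight or filter markers differently.

```julia
# ASSUMPTION: unweighted least squares through the origin on (z^2 - 1) versus LD score.
function estimate_h2_sketch(z::AbstractVector, lds::AbstractVector, n::Real)
    m = length(z)                             # number of markers M
    y = z .^ 2 .- 1                           # subtract the fixed intercept of 1
    slope = sum(lds .* y) / sum(abs2, lds)    # estimate of n * h^2 / M
    return slope * m / n
end

# Noise-free toy data generated from the model itself.
lds = [1.0, 2.0, 4.0, 8.0]
n, h2_true = 1_000, 0.4
z = sqrt.(1 .+ (n * h2_true / length(lds)) .* lds)
println(estimate_h2_sketch(z, lds, n))
```

Because the toy z-scores are generated without noise, the estimate reproduces h2_true up to floating-point error.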

PRS.estimate_neff - Method
estimate_neff(beta, sebeta, freq)

Estimate effective number of samples in case-control study.

The effective number of samples is the total number of samples in a cohort with equal numbers of cases and controls that would result in the observed sebeta.

freq can be minor or major allele freq, as only freq*(1-freq) is being used.
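When the group sizes are known, the case-control formula quoted in the estimate_h2 entry, 4 n_0 n_1 / (n_0 + n_1), gives the same quantity directly. The helper neff_from_counts below is a hypothetical name and is not the package's estimate_neff, which works from beta, sebeta, and freq instead.

```julia
# Effective sample size from the two group sizes: 4 n0 n1 / (n0 + n1).
neff_from_counts(n0::Real, n1::Real) = 4 * n0 * n1 / (n0 + n1)

println(neff_from_counts(5_000, 5_000))   # balanced design
println(neff_from_counts(9_000, 1_000))   # unbalanced design, same total
```

For a balanced design the formula returns the total sample size; any imbalance makes the effective size smaller than the total.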


lassosum

PRS.z2cor - Method
z2cor(z, n)

Convert GWAS z-value to phenotype-genotype correlation coefficient.
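One common conversion, used for example in the lassosum R package, treats z as a t-statistic with n - 2 degrees of freedom and sets r = z / sqrt(n - 2 + z^2). Whether PRS.z2cor uses exactly this formula is an assumption; z2cor_sketch is a hypothetical name.

```julia
# ASSUMPTION: r = z / sqrt(n - 2 + z^2), treating z as a t-statistic with n - 2 df.
z2cor_sketch(z::Real, n::Real) = z / sqrt(n - 2 + z^2)

# Inverse relation, used here only to check the conversion round-trips.
cor2z_sketch(r::Real, n::Real) = r * sqrt(n - 2) / sqrt(1 - r^2)

r = z2cor_sketch(3.0, 1_000)
println(r)
println(cor2z_sketch(r, 1_000))
```

The result always lies strictly between -1 and 1 and keeps the sign of z.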

Missing docstring.

Missing docstring for elnetg!(X, r, λ, s, β = zeros(Float64, length(r)); maxiter = 10_000, thresh = 1e-4). Check Documenter's build log for details.

PRS.elnetg_path - Method
elnetg_path(X, r, λs, s;
            maxiter = 10_000, thresh = 1e-4)

Solve elastic net for path along λs using warm starts.

Arguments

  • X::Matrix nsubj x nmarkers genotype matrix, column-normalized (μ = 0, σ = 1)
  • r::Vector nmarkers x 1 vector of correlation coefficients between phenotype and genotypes
  • λs::Vector shrinkage parameter path for 1-norm in decreasing order
  • s::Real shrinkage parameter for 2-norm (LD)
  • β::Vector warm start (in) and result (out) for solution vector
  • maxiter::Int maximum number of iterations
  • thresh::Real maximum change in β to be called converged
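The idea of solving along a λ path with warm starts can be sketched with a small coordinate-descent solver. The objective assumed here is the lassosum-style form βᵀAβ - 2βᵀr + 2λ‖β‖₁ with A = (1 - s) XᵀX / n + s I; this scaling, and the names elnet_sketch! and elnet_path_sketch, are assumptions and not the package's elnetg! or elnetg_path.

```julia
using LinearAlgebra

# Soft-thresholding operator used by the 1-norm penalty.
soft(u, λ) = sign(u) * max(abs(u) - λ, 0.0)

# Coordinate descent for βᵀAβ - 2βᵀr + 2λ‖β‖₁ (ASSUMED objective).
# `β` is both the warm start (in) and the solution (out). `A` must be symmetric.
function elnet_sketch!(β, A, r, λ; maxiter = 10_000, thresh = 1e-4)
    for _ in 1:maxiter
        maxchange = 0.0
        for j in eachindex(β)
            # r_j minus the contribution of all other coordinates
            u = r[j] - dot(view(A, :, j), β) + A[j, j] * β[j]
            βj = soft(u, λ) / A[j, j]
            maxchange = max(maxchange, abs(βj - β[j]))
            β[j] = βj
        end
        maxchange < thresh && break   # converged: largest change below thresh
    end
    return β
end

# Solve for each λ in turn, reusing the previous solution as the warm start.
function elnet_path_sketch(X, r, λs, s; kwargs...)
    n = size(X, 1)
    A = (1 - s) * (X' * X) / n + s * I
    β = zeros(length(r))
    path = zeros(length(r), length(λs))
    for (k, λ) in enumerate(λs)
        elnet_sketch!(β, A, r, λ; kwargs...)
        path[:, k] = β
    end
    return path
end

X = sqrt(2) * [1.0 0.0; 0.0 1.0]   # toy matrix with XᵀX / n equal to the identity
r = [0.5, -0.2]
println(elnet_path_sketch(X, r, [0.3, 0.1], 0.5))
```

Passing λs in decreasing order means each solve starts from a slightly sparser neighbouring solution, which usually needs only a few sweeps to converge.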

utils

PRS.snp_join - Method
snp_join(df1, df2; on = [:Chrom, :Pos],
         alleles1 = [:A1_1, :A2_1], alleles2 = [:A1_2, :A2_2],
         matchcol = :sign)

Join data frames df1 and df2 and match variants.

Data frames df1 and df2 are first matched by columns specified in on. Then column matchcol is set to

  • matchcol == 1 if values for alleles1 and alleles2 match
  • matchcol == -1 if values for alleles1 and alleles2 are swapped
  • matchcol == 0 otherwise

Note: Swapped means simply swapping of alleles, not applying reverse complement as done in other implementations.
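The three-way matching rule can be written as a small per-variant function. The name allele_match is hypothetical and not part of PRS.jl; the sketch only mirrors the rule stated above, including the note that reverse complements are not considered.

```julia
# Hypothetical per-row version of the matching rule; not the PRS.jl implementation.
# Returns 1 for identical allele pairs, -1 for swapped pairs, 0 otherwise.
function allele_match(a1_1, a2_1, a1_2, a2_2)
    if a1_1 == a1_2 && a2_1 == a2_2
        return 1
    elseif a1_1 == a2_2 && a2_1 == a1_2
        return -1
    else
        return 0
    end
end

println(allele_match("A", "G", "A", "G"))   # same alleles, same order
println(allele_match("A", "G", "G", "A"))   # same alleles, swapped
println(allele_match("A", "G", "T", "C"))   # reverse complement: not matched
```

A sign of -1 typically means that effect sizes or z-scores from one of the two data frames need to be negated before they are combined.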
