This page was generated on 2025-10-18 23:55 -0400 (Sat, 18 Oct 2025).
| Hostname | OS | Arch (*) | R version | Installed pkgs |
|---|---|---|---|---|
| nebbiolo2 | Linux (Ubuntu 24.04.3 LTS) | x86_64 | 4.5.1 Patched (2025-08-23 r88802) -- "Great Square Root" | 4887 |
| lconway | macOS 12.7.6 Monterey | x86_64 | 4.5.1 Patched (2025-09-10 r88807) -- "Great Square Root" | 4677 |

(*) Arch as reported by 'uname -p', except on Windows and Mac OS X.

| Package 26/31 | Hostname | OS / Arch | CHECK |
|---|---|---|---|
| MungeSumstats 1.17.5, Alan Murphy | nebbiolo2 | Linux (Ubuntu 24.04.3 LTS) / x86_64 | OK |
| MungeSumstats 1.17.5, Alan Murphy | lconway | macOS 12.7.6 Monterey / x86_64 | OK |

To the developers/maintainers of the MungeSumstats package:
- Use the following Renviron settings to reproduce errors and warnings.
- If 'R CMD check' started to fail recently on the Linux builder(s) over a missing dependency, add the missing dependency to 'Suggests:' in your DESCRIPTION file. See Renviron.bioc for more information.

| Field | Value |
|---|---|
| Package | MungeSumstats |
| Version | 1.17.5 |
| Command | /home/biocbuild/bbs-3.22-bioc/R/bin/R CMD check --test-dir=longtests --no-stop-on-test-error --no-codoc --no-examples --no-manual --ignore-vignettes --check-subdirs=no MungeSumstats_1.17.5.tar.gz |
| StartedAt | 2025-10-18 16:21:53 -0400 (Sat, 18 Oct 2025) |
| EndedAt | 2025-10-18 16:42:09 -0400 (Sat, 18 Oct 2025) |
| ElapsedTime | 1216.5 seconds |
| RetCode | 0 |
| Status | OK |
| CheckDir | MungeSumstats.Rcheck |
| Warnings | 0 |

MungeSumstats.Rcheck/tests/testthat.Rout
R version 4.5.1 Patched (2025-08-23 r88802) -- "Great Square Root"
Copyright (C) 2025 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(testthat)
> library(MungeSumstats)
>
> test_check("MungeSumstats")
Error in get_query_content(.) :
Status code from OpenGWAS API: 401
Message: ERROR - Go to https://api.opengwas.io/ - From 1st May 2024 you must provide a token (JWT) alongside most of your requests. Read more at https://api.opengwas.io/ and also check for the latest version at https://mrcieu.github.io/ieugwasr/
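
The 401 above says that most OpenGWAS requests now need a token (JWT). The following is a minimal sketch of attaching such a token to a request, not the code path MungeSumstats or ieugwasr uses. It assumes the token is sent as a Bearer token in the `Authorization` header and is read from an environment variable called `OPENGWAS_JWT`; both are assumptions to check against https://api.opengwas.io/. The helper name `build_opengwas_request` is hypothetical, and nothing is sent over the network.

```python
import os
import urllib.request


def build_opengwas_request(url, token=None, env_var="OPENGWAS_JWT"):
    """Build (but do not send) a request carrying a JWT as a Bearer token.

    Assumptions of this sketch: the Bearer scheme and the environment
    variable name are not taken from this report.
    """
    if token is None:
        token = os.environ.get(env_var)
    if not token:
        # Without a token the API is expected to answer 401, as in the log.
        raise RuntimeError(f"no token supplied and {env_var} is not set")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Bearer {token}")
    return request


if __name__ == "__main__":
    # Placeholder token for illustration only.
    req = build_opengwas_request("https://api.opengwas.io/", token="dummy-token")
    print(req.get_header("Authorization"))
```

Failing early when no token is available gives a clearer message than the remote 401.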
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463b337206.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74617de8a79
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A0 A1 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A0 A1 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7463b337206.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.069 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7464778b007.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74617de8a79
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7464778b007.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.053 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
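
The "Standardising column headers" step in the run above turns the input header `MarkerName CHR POS A1 A2 EAF Beta SE Pval` into `SNP CHR BP A1 A2 FRQ BETA SE P`. A minimal sketch of that renaming follows, assuming it amounts to upper-casing plus a lookup table. The four-entry `HEADER_MAP` is inferred only from these two header lines, and both it and `standardise_header` are hypothetical stand-ins for the package's much larger mapping.

```python
# Partial mapping inferred from the header lines in this log only;
# the real lookup table used by the package is far larger.
HEADER_MAP = {
    "MARKERNAME": "SNP",
    "POS": "BP",
    "EAF": "FRQ",
    "PVAL": "P",
}


def standardise_header(columns):
    """Upper-case each column name and rename it when a mapping is known."""
    standardised = []
    for column in columns:
        key = column.strip().upper()
        standardised.append(HEADER_MAP.get(key, key))
    return standardised


if __name__ == "__main__":
    raw = "MarkerName CHR POS A1 A2 EAF Beta SE Pval".split()
    print(" ".join(standardise_header(raw)))
```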
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7462417d256.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74637332183
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A2 A1 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A2 A1 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for correct direction of A1 (reference) and A2 (alternative allele).
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 29 seconds.
There are 47 SNPs where A1 doesn't match the reference genome.
These will be flipped with their effect columns.
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Checking for SNPs with duplicated base-pair positions.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
67 SNPs (72%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7462417d256.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.556 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 G A 0.63060 -0.017 0.003 2.359e-10
3: rs34305371 1 72733610 G A 0.91231 -0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
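
In the preview above, rs11210860 changes from `A G 0.36940 0.017` (earlier runs) to `G A 0.63060 -0.017` after flipping. A minimal sketch of that flip for one row follows, assuming it swaps A1 and A2, negates BETA and replaces FRQ with 1 - FRQ. The log does not list every column the package flips. The function name `flip_to_reference`, the reference allele passed in, and the error raised when neither allele matches are choices made for this sketch.

```python
def flip_to_reference(row, ref_allele):
    """Return a copy of `row` with A1 set to the reference allele.

    When A1 already matches, the row is returned unchanged. When A2 matches,
    the alleles are swapped and the effect columns are flipped with them.
    """
    if row["A1"] == ref_allele:
        return dict(row)
    if row["A2"] != ref_allele:
        # Sketch-only behaviour: the package's handling is not shown in the log.
        raise ValueError("neither allele matches the reference")
    flipped = dict(row)
    flipped["A1"], flipped["A2"] = row["A2"], row["A1"]
    flipped["BETA"] = -row["BETA"]
    # Rounded to the five decimals shown in the preview.
    flipped["FRQ"] = round(1.0 - row["FRQ"], 5)
    return flipped


if __name__ == "__main__":
    before = {"SNP": "rs11210860", "A1": "A", "A2": "G", "FRQ": 0.36940, "BETA": 0.017}
    # "G" is inferred from the flipped preview row, not looked up in a genome.
    print(flip_to_reference(before, ref_allele="G"))
```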
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7467aa1b3dd.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74637332183
Checking for empty columns.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Checking for correct direction of A1 (reference) and A2 (alternative allele).
There are 46 SNPs where A1 doesn't match the reference genome.
These will be flipped with their effect columns.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Checking for SNPs with duplicated base-pair positions.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
67 SNPs (72%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7467aa1b3dd.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.302 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 G A 0.63060 -0.017 0.003 2.359e-10
3: rs34305371 1 72733610 G A 0.91231 -0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746769bf8aa.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7469a140b
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Checking for SNPs with duplicated base-pair positions.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
1 SNPs are non-biallelic. These will be removed.
Writing in tabular format ==> /tmp/RtmpOxg8AH/snp_bi_allelic.tsv.gz
46 SNPs (50%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746769bf8aa.tsv.gz
Summary statistics report:
- 92 rows (98.9% of original 93 rows)
- 92 unique variants
- 69 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.304 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
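
The percentages in the reports above follow from the row counts: one non-biallelic SNP removed from 93 leaves 92 rows (98.9%), and 46 of those 92 have FRQ > 0.5 (50%). A small sketch of that arithmetic follows, assuming one-decimal rounding with a trailing `.0` dropped, which matches the figures printed in this log. The helper `pct` is hypothetical.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place.

    The 'g' format drops a trailing '.0', so 72.0 is printed as '72%'.
    """
    return f"{round(100 * part / whole, 1):g}%"


if __name__ == "__main__":
    print(pct(92, 93))  # rows kept after removing one non-biallelic SNP
    print(pct(46, 92))  # SNPs with FRQ > 0.5 among the remaining rows
```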
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746217a4b13.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7469a140b
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Checking for SNPs with duplicated base-pair positions.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 17 seconds.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746217a4b13.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.343 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Sorting coordinates with 'data.table'.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746ab212a7.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Found 1 Indels. These will be removed from the sumstats.
WARNING If you want to keep Indels, set the drop_indel param to FALSE & rerun MungeSumstats::format_sumstats()
Writing in tabular format ==> /tmp/RtmpOxg8AH/indel.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74658b7910b.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7464953558b
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 92 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 16 seconds.
Checking A1 is uppercase
Checking A2 is uppercase
Effect/frq column(s) relate to A2 in the inputted sumstats
Found direction from matching reference genome - NOTE this assumes non-effect allele will match the reference genome
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Ensuring parameters comply with LDSC format.
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
1 SNP IDs are not correctly formatted. These will be corrected from the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Coercing BP column to numeric.
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Checking for correct direction of A1 (reference) and A2 (alternative allele).
There are 46 SNPs where A1 doesn't match the reference genome.
These will be flipped with their effect columns.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Checking for SNPs with duplicated base-pair positions.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
Computing Z-score from P using formula: `sign(BETA)*sqrt(stats::qchisq(P,1,lower=FALSE)`
Assigning N=1001 for all SNPs.
67 SNPs (72%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Renaming A1,A2 to match LDSC format.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74658b7910b.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.652 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 C T 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.63060 -0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.91231 -0.035 0.005 3.762e-14
4: rs2568955 1 72762169 C T 0.23690 -0.017 0.003 1.797e-08
IMPUTATION_SNP flipped Z IMPUTATION_z_score_p N
<lgcl> <lgcl> <num> <lgcl> <int>
1: NA NA 5.630777 TRUE 1001
2: NA TRUE -6.335939 TRUE 1001
3: NA TRUE -7.568968 TRUE 1001
4: NA NA -5.630488 TRUE 1001
Returning path to saved data.
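
The Z column in the preview above comes from the logged formula `sign(BETA)*sqrt(stats::qchisq(P,1,lower=FALSE))`. A minimal sketch of the same calculation follows. It relies on the identity that the square root of the upper-tail chi-square quantile with one degree of freedom equals the standard normal quantile at 1 - P/2, so only the standard library is needed. The function name `z_from_p` is hypothetical.

```python
from math import copysign
from statistics import NormalDist


def z_from_p(beta, p):
    """Signed Z-score from an effect estimate and a two-sided P-value."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if beta == 0:
        # sign(0) is 0 in R, so the product is 0.
        return 0.0
    # The lower-tail quantile at p/2 is negative; negate it for the magnitude.
    magnitude = -NormalDist().inv_cdf(p / 2.0)
    return copysign(magnitude, beta)


if __name__ == "__main__":
    # BETA and P of the first and fourth preview rows.
    print(z_from_p(0.019, 1.794e-08))
    print(z_from_p(-0.017, 1.797e-08))
```

For the first and fourth preview rows this should land close to the logged 5.630777 and -5.630488. Small differences can arise because the preview prints P to a limited number of digits.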
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7465a90ac0b.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74679f3628d
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N_CON N_CAS
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N_CON N_CAS
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Computing effective sample size using the LDSC method:
Neff = (N_CAS+N_CON) * (N_CAS/(N_CAS+N_CON)) / mean((N_CAS/(N_CAS+N_CON))[(N_CAS+N_CON)==max(N_CAS+N_CON)]))
Computing sample size using the sum method:
N = N_CAS + N_CON
Computing effective sample size using the GIANT method:
Neff = 2 / (1/N_CAS + 1/N_CON)
Computing effective sample size using the METAL method:
Neff = 4 / (1/N_CAS + 1/N_CON)
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7465a90ac0b.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.052 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P N_CON
<char> <int> <int> <char> <char> <num> <num> <num> <num> <int>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08 100
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10 100
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14 100
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08 100
N_CAS Neff_ldsc N Neff_giant Neff_metal
<int> <int> <int> <int> <int>
1: 120 220 220 109 218
2: 120 220 220 109 218
3: 120 220 220 109 218
4: 120 220 220 109 218
Returning path to saved data.
Loading required namespace: GenomicFiles
Using local VCF.
bgzip-compressing VCF file.
Finding empty VCF columns based on first 10,000 rows.
Dropping 1 duplicate column(s).
1 sample detected: EBI-a-GCST005647
Constructing ScanVcfParam object.
VCF contains: 39,630,630 variant(s) x 1 sample(s)
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Dropping 1 duplicate column(s).
Checking for empty columns.
Unlisting 3 columns.
Dropped 314 duplicate rows.
Time difference of 0.1 secs
VCF data.table contains: 101 rows x 11 columns.
Time difference of 0.4 secs
Renaming ID as SNP.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74612ffc33f.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74615da04d
Checking for empty columns.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
Infer Effect Column
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P N
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P N
Summary statistics report:
- 101 rows
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
2 SNPs (2%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74612ffc33f.tsv.gz
Summary statistics report:
- 101 rows (100% of original 101 rows)
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 END FILTER FRQ BETA LP
<char> <int> <int> <char> <char> <int> <char> <num> <num> <num>
1: rs58108140 1 10583 G A 10583 PASS 0.1589 0.0312 0.369267
2: rs806731 1 30923 G T 30923 PASS 0.7843 -0.0114 0.126854
3: rs116400033 1 51479 T A 51479 PASS 0.1829 0.0711 1.262410
4: rs146477069 1 54421 A G 54421 PASS 0.0352 -0.0240 0.112102
SE P N
<num> <num> <int>
1: 0.0393 0.42730011 293723
2: 0.0353 0.74669974 293723
3: 0.0370 0.05464998 293723
4: 0.0830 0.77249913 293723
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74632fda1d.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
Infer Effect Column
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P N Beta
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P N Beta
Summary statistics report:
- 101 rows
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
2 SNPs (2%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74632fda1d.tsv.gz
Summary statistics report:
- 101 rows (100% of original 101 rows)
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 END FILTER FRQ ES LP
<char> <int> <int> <char> <char> <int> <char> <num> <num> <num>
1: rs58108140 1 10583 G A 10583 PASS 0.1589 0.0312 0.369267
2: rs806731 1 30923 G T 30923 PASS 0.7843 -0.0114 0.126854
3: rs116400033 1 51479 T A 51479 PASS 0.1829 0.0711 1.262410
4: rs146477069 1 54421 A G 54421 PASS 0.0352 -0.0240 0.112102
SE P N BETA
<num> <num> <int> <num>
1: 0.0393 0.42730011 293723 0.0312
2: 0.0353 0.74669974 293723 -0.0114
3: 0.0370 0.05464998 293723 0.0711
4: 0.0830 0.77249913 293723 -0.0240
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7461a4a18a0.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74615da04d
Checking for empty columns.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
Infer Effect Column
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP P N
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP P N
Summary statistics report:
- 101 rows
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
The sumstats SE column is not present...Deriving SE from Beta and P
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
2 SNPs (2%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7461a4a18a0.tsv.gz
Summary statistics report:
- 101 rows (100% of original 101 rows)
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 END FILTER FRQ BETA LP
<char> <int> <int> <char> <char> <int> <char> <num> <num> <num>
1: rs58108140 1 10583 G A 10583 PASS 0.1589 0.0312 0.369267
2: rs806731 1 30923 G T 30923 PASS 0.7843 -0.0114 0.126854
3: rs116400033 1 51479 T A 51479 PASS 0.1829 0.0711 1.262410
4: rs146477069 1 54421 A G 54421 PASS 0.0352 -0.0240 0.112102
P N SE IMPUTATION_SE
<num> <int> <num> <lgcl>
1: 0.42730011 293723 0.03930361 TRUE
2: 0.74669974 293723 0.03529477 TRUE
3: 0.05464998 293723 0.03699948 TRUE
4: 0.77249913 293723 0.08301411 TRUE
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7465de9d8ea.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74615da04d
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ Z SE P N
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ Z SE P N
Summary statistics report:
- 25 rows
- 25 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
The sumstats BETA column is not present...Deriving BETA from Z and SE
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
13 SNPs (52%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7465de9d8ea.tsv.gz
Summary statistics report:
- 25 rows (100% of original 25 rows)
- 25 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ Z SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs12184267 1 715265 C T 0.9591931 -0.916 0.007518884 0.3598
2: rs12184277 1 715367 A G 0.9589313 -0.656 0.007491601 0.5116
3: rs12184279 1 717485 C A 0.9594241 -1.050 0.007534860 0.2938
4: rs116801199 1 720381 G T 0.9578380 -0.300 0.007391344 0.7644
N BETA IMPUTATION_BETA
<int> <num> <lgcl>
1: 225955 -0.006887298 TRUE
2: 226215 -0.004914490 TRUE
3: 226224 -0.007911603 TRUE
4: 226626 -0.002217403 TRUE
Returning path to saved data.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Sorting coordinates with 'data.table'.
Filtering SNPs based on INFO score.
46 SNPs are below the INFO threshold of 0.9 and will be removed.
Writing in tabular format ==> /tmp/RtmpOxg8AH/info_filter.tsv.gz
INFO_filter==0. Skipping INFO score filtering step.
Filtering SNPs based on INFO score.
All rows have INFO>=0.9
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Sorting coordinates with 'data.table'.
3 p-values are >1 which LDSC/MAGMA may not be able to handle. These will be converted to 1.
5 p-values are <0 which LDSC/MAGMA may not be able to handle. These will be converted to 0.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Sorting coordinates with 'data.table'.
8 p-values are <=5e-324 which LDSC/MAGMA may not be able to handle. These will be converted to 0.
Reading header.
Tabular format detected.
Reading header.
Tabular format detected.
Reading header.
Tabular format detected.
Reading header.
VCF format detected.This will be converted to a standardised table format.
Importing tabular file: /home/biocbuild/bbs-3.22-bioc-longtests/meat/MungeSumstats.Rcheck/MungeSumstats/extdata/eduAttainOkbay.txt
Checking for empty columns.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Computing Z-score from P using formula: `sign(BETA)*sqrt(stats::qchisq(P,1,lower=FALSE)`
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA SE P Z newZ
Computing Z-score from BETA ans SE using formula: `BETA/SE`
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74656b041f6.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466de9fc2
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName EAF Beta SE Pval CHR_BP_A2_A1
Standardising column headers.
First line of summary statistics file:
MarkerName EAF Beta SE Pval CHR_BP_A2_A1
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Summary statistics file does not have obvious CHR/BP columns. Checking to see if they are joined in another column.
Column CHR_BP_A2_A1 has been separated into the columns CHR, BP, A2, A1
If this is the incorrect format for the column, update the column name to the correct format e.g.`CHR:BP:A2:A1` and format_sumstats().
Standardising column headers.
First line of summary statistics file:
SNP FRQ BETA SE P CHR BP A2 A1
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74656b041f6.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.094 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7467831f825.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466de9fc2
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7467831f825.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7466e7b8739.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7463435d92
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName EAF Beta SE Pval CHR_BP_A2_A1
Standardising column headers.
First line of summary statistics file:
MarkerName EAF Beta SE Pval CHR_BP_A2_A1
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Summary statistics file does not have obvious CHR/BP columns. Checking to see if they are joined in another column.
Column CHR_BP_A2_A1 has been separated into the columns CHR, BP, A2, A1
If this is the incorrect format for the column, update the column name to the correct format e.g.`CHR:BP:A2:A1` and format_sumstats().
Standardising column headers.
First line of summary statistics file:
SNP FRQ BETA SE P CHR BP A2 A1
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466e7b8739.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.094 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746738c4e54.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7463435d92
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746738c4e54.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7462c485b76.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466783bbab
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS EAF Beta SE Pval alleles allele
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS EAF Beta SE Pval alleles allele
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Warning: Multiple columns in the sumstats file seem to relate to alleles A1>A2.
The column ALLELES will be kept whereas the column(s) ALLELE will be removed.
If this is not the correct column to keep, please remove all incorrect columns from those listed here before
running `format_sumstats()`.
Column ALLELES has been separated into the columns A1, A2
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7462c485b76.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74619716d3d.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466783bbab
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74619716d3d.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7462d5bd38.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7464a3c7e93
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName A1 A2 EAF Beta SE Pval CHR_BP
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName A1 A2 EAF Beta SE Pval CHR_BP
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Summary statistics file does not have obvious CHR/BP columns. Checking to see if they are joined in another column.
Column CHR_BP has been separated into the columns CHR, BP
Standardising column headers.
First line of summary statistics file:
SNP A1 A2 FRQ BETA SE P CHR BP
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7462d5bd38.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.094 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74629cee854.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7464a3c7e93
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74629cee854.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74640335a8a.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74644d1f952
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName A1 A2 EAF Beta SE Pval CHR_BP CHR_BP_2
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName A1 A2 EAF Beta SE Pval CHR_BP CHR_BP_2
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Summary statistics file does not have obvious CHR/BP columns. Checking to see if they are joined in another column.
Warning: Multiple columns in the sumstats file seem to relate to Chromosome:Base Pair position.
The column CHR_BP_2 will be kept whereas the column(s) CHR_BP will be removed.
If this is not the correct column to keep, please remove all incorrect columns from those listed here before
running `format_sumstats()`.
Column CHR_BP_2 has been separated into the columns CHR, BP
Standardising column headers.
First line of summary statistics file:
SNP A1 A2 FRQ BETA SE P CHR BP
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74640335a8a.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.096 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7464a441a49.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74644d1f952
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7464a441a49.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74632dd8fa.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746269115b3
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74632dd8fa.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7466820ea95.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7463871ad9
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466820ea95.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
Setting sorted=FALSE (required when formatted=FALSE).
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7465aebfc0a.tsv.gz
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Assigning N=1000 for all SNPs.
N already exists within sumstats_dt.
[1] "Testing: compute_n='ldsc'"
Computing effective sample size using the LDSC method:
Neff = (N_CAS+N_CON) * (N_CAS/(N_CAS+N_CON)) / mean((N_CAS/(N_CAS+N_CON))[(N_CAS+N_CON)==max(N_CAS+N_CON)]))
[1] "Testing: compute_n='giant'"
Computing effective sample size using the GIANT method:
Neff = 2 / (1/N_CAS + 1/N_CON)
[1] "Testing: compute_n='metal'"
Computing effective sample size using the METAL method:
Neff = 4 / (1/N_CAS + 1/N_CON)
[1] "Testing: compute_n='sum'"
Computing sample size using the sum method:
N = N_CAS + N_CON
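The four sample-size messages above can be read as the following arithmetic. This is a minimal sketch in Python, not the package's R implementation: the function names (`neff_giant`, `neff_metal`, `n_sum`, `neff_ldsc`) and the example counts are hypothetical, and the LDSC version assumes the printed formula means "divide by the mean case proportion among the SNPs with the largest total N".

```python
# Illustrative only: hypothetical helpers mirroring the formulas printed in the log.

def neff_giant(n_cas, n_con):
    """GIANT method: Neff = 2 / (1/N_CAS + 1/N_CON)."""
    return 2 / (1 / n_cas + 1 / n_con)


def neff_metal(n_cas, n_con):
    """METAL method: Neff = 4 / (1/N_CAS + 1/N_CON)."""
    return 4 / (1 / n_cas + 1 / n_con)


def n_sum(n_cas, n_con):
    """Sum method: N = N_CAS + N_CON."""
    return n_cas + n_con


def neff_ldsc(n_cas, n_con):
    """LDSC method, computed per SNP from lists of case/control counts.

    Assumed reading of the printed formula:
    Neff = (N_CAS+N_CON) * (N_CAS/(N_CAS+N_CON)) / mean(case proportion
    over the SNPs whose total N equals the maximum total N).
    """
    totals = [a + b for a, b in zip(n_cas, n_con)]
    props = [a / t for a, t in zip(n_cas, totals)]
    max_total = max(totals)
    # Case proportions restricted to SNPs with the maximum total sample size.
    ref_props = [p for p, t in zip(props, totals) if t == max_total]
    ref_mean = sum(ref_props) / len(ref_props)
    return [t * p / ref_mean for t, p in zip(totals, props)]


if __name__ == "__main__":
    # Made-up counts, not taken from the test data in this log.
    print(neff_giant(1000, 1000))
    print(neff_metal(1000, 1000))
    print(n_sum(1000, 1000))
    print(neff_ldsc([100, 200], [300, 200]))
```

With equal case and control counts, the METAL value is twice the GIANT value, since the two formulas differ only in the numerator.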
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7462295a1b0.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7464d9f59cd
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7462295a1b0.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7467dd48ccd.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Saving output messages to:
/tmp/RtmpOxg8AH/file16e7467dd48ccd_log_msg.txt
Any runtime errors will be saved to:
/tmp/RtmpOxg8AH/file16e7467dd48ccd_log_output.txt
Messages will not be printed to terminal.
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7467d25199a.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74655242a5f
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7467d25199a.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7461a2991ee.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466c79841e
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 186 rows
- 93 unique variants
- 140 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
93 sumstat rows are duplicated. These duplicates will be removed.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7461a2991ee.tsv.gz
Summary statistics report:
- 93 rows (50% of original 186 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746689f485b.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466c79841e
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746689f485b.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463c5fd525.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466c79841e
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 94 rows
- 94 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Checking for SNPs with duplicated base-pair positions.
1 base-pair positions are duplicated in the sumstats file. These duplicates will be removed.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7463c5fd525.tsv.gz
Summary statistics report:
- 93 rows (98.9% of original 94 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.301 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746628b33bf.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746477504fc
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Filtering effect columns, ensuring none equal 0.
5 SNPs have effect values = 0 and will be removed
Ensuring all SNPs have N<5 std dev above mean.
44 SNPs (50%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746628b33bf.tsv.gz
Summary statistics report:
- 88 rows (94.6% of original 93 rows)
- 88 unique variants
- 65 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.051 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7464a4ce9c3.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74627bd26fa
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval FRQ
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval FRQ
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs based on FRQ.
38 SNPs are below the FRQ threshold of 0.9 and will be removed.
Writing in tabular format ==> /tmp/RtmpOxg8AH/frq_filter.tsv.gz
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
55 SNPs (100%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7464a4ce9c3.tsv.gz
Summary statistics report:
- 55 rows (59.1% of original 93 rows)
- 55 unique variants
- 41 genome-wide significant variants (P<5e-8)
- 16 chromosomes
Done munging in 0.063 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 EAF BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
2: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
3: rs1008078 1 91189731 T C 0.37310 -0.016 0.003 6.005e-10
4: rs61787263 1 98618714 T C 0.76120 0.016 0.003 5.391e-08
FRQ
<num>
1: 1.863269
2: 1.169733
3: 1.401423
4: 1.873332
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463211e11a.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74627bd26fa
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval FRQ
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval FRQ
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs based on FRQ.
38 SNPs are below the FRQ threshold of 0.9 and will be removed.
Writing in tabular format ==> /tmp/RtmpOxg8AH/frq_filter.tsv.gz
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
55 SNPs (100%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=FALSE, the FRQ column will be renamed MAJOR_ALLELE_FRQ to differentiate the values from
minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7463211e11a.tsv.gz
Summary statistics report:
- 55 rows (59.1% of original 93 rows)
- 55 unique variants
- 41 genome-wide significant variants (P<5e-8)
- 16 chromosomes
Done munging in 0.047 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 EAF BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
2: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
3: rs1008078 1 91189731 T C 0.37310 -0.016 0.003 6.005e-10
4: rs61787263 1 98618714 T C 0.76120 0.016 0.003 5.391e-08
MAJOR_ALLELE_FRQ
<num>
1: 1.863269
2: 1.169733
3: 1.401423
4: 1.873332
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7462dc3949e.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74622ff8054
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA SE P
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA SE P
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7462dc3949e.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.047 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746ea6c090.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Infer Effect Column
First line of summary statistics file:
SNP CHR BP A1 A2 Uniq.a1a2 EAF BETA P
Allele columns are ambiguous, attempting to infer direction
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 Uniq.a1a2 EAF BETA P
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 2 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 13 seconds.
Checking A1 is uppercase
Checking A2 is uppercase
Effect/frq column(s) relate to A1 in the inputted sumstats
Found direction from matching reference genome - NOTE this assumes non-effect allele will match the reference genome
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A2 A1 UNIQ.A1A2 EAF BETA P
Summary statistics report:
- 3 rows
- 3 unique variants
- 1 genome-wide significant variants (P<5e-8)
- 2 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Writing in tabular format ==> /tmp/RtmpOxg8AH/snp_missing_rs.tsv.gz
Writing in tabular format ==> /tmp/RtmpOxg8AH/snp_multi_colon.tsv.gz
1 SNP IDs appear to be made up of chr:bp, these will be replaced by their SNP ID from the reference genome
Loading SNPlocs data for build 144 on GRCH37.
Found Indels. These won't be checked against the reference genome as it does not contain Indels.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Coercing BP column to numeric.
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 2 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 13 seconds.
Found 1 Indels. These won't be checked against the reference genome as it does not contain Indels.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
Checking for correct direction of A1 (reference) and A2 (alternative allele).
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Found 1 Indels. These won't be checked for duplicates based on RS ID as there can be multiples.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
Checking for SNPs with duplicated base-pair positions.
Found 1 Indels. These won't be checked for duplicates based on base-pair position as there can be multiples.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
SE is not present but can be imputed with BETA & P. Set impute_se=TRUE and rerun to do this.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746ea6c090.tsv.gz
Summary statistics report:
- 2 rows (66.7% of original 3 rows)
- 2 unique variants
- 1 genome-wide significant variants (P<5e-8)
- 2 chromosomes
Done munging in 0.568 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 UNIQ.A1A2 FRQ
<char> <int> <int> <char> <char> <char> <num>
1: rs12987662 2 100821548 C A aa 0.3787000
2: rs34589910 4 6364621 CG C 4:6364621_C_CG 0.0945334
BETA P
<num> <num>
1: 0.027000000 2.693000e-24
2: -0.006257323 4.883341e-01
Returning data directly.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7464c3a4260.tsv
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
Reading header.
Reading entire file.
Sorting coordinates with 'GenomicRanges'.
Converting summary statistics to GenomicRanges.
Sorting coordinates with 'data.table'.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Sorting coordinates with 'data.table'.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74669e1bd6a.tsv.gz
Infer Effect Column
First line of summary statistics file:
SNP CHR BP non_effect_allele effect_allele FRQ BETA1 SE P
Standardising column headers.
First line of summary statistics file:
SNP CHR BP non_effect_allele effect_allele FRQ BETA1 SE P
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74669e1bd6a.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.046 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning data directly.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74655bcfcaa.tsv.gz
Infer Effect Column
First line of summary statistics file:
SNP CHR BP A2 A1 FRQ BETA1 SE P
Allele columns are ambiguous, attempting to infer direction
Found direction from effect/frq column naming
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A2 A1 FRQ BETA1 SE P
Effect/frq column(s) relate to A1 in the sumstat
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA1 SE P
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74655bcfcaa.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.094 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning data directly.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746365fba9c.tsv.gz
Infer Effect Column
First line of summary statistics file:
SNP CHR BP A2 A1 A1FRQ BETA SE P
Allele columns are ambiguous, attempting to infer direction
Found direction from effect/frq column naming
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A2 A1 A1FRQ BETA SE P
Effect/frq column(s) relate to A1 in the sumstat
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 A1FRQ BETA SE P
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746365fba9c.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.093 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning data directly.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74660ac8bf6.tsv.gz
Infer Effect Column
First line of summary statistics file:
SNP CHR BP A2 A1 FRQ BETA SE P
Allele columns are ambiguous, attempting to infer direction
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A2 A1 FRQ BETA SE P
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 76 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Checking A1 is uppercase
Checking A2 is uppercase
Effect/frq column(s) relate to A1 in the inputted sumstats
Found direction from matching reference genome - NOTE this assumes non-effect allele will match the reference genome
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA SE P
Summary statistics report:
- 76 rows
- 76 unique variants
- 55 genome-wide significant variants (P<5e-8)
- 19 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 76 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
41 SNPs (53.9%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74660ac8bf6.tsv.gz
Summary statistics report:
- 76 rows (100% of original 76 rows)
- 76 unique variants
- 55 genome-wide significant variants (P<5e-8)
- 19 chromosomes
Done munging in 0.614 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs114598875 2 60976384 A G 0.8246 -0.020 0.004 2.405e-08
2: rs13402908 2 100333377 T C 0.5056 -0.018 0.003 1.695e-11
3: rs34106693 2 101151830 C G 0.8190 0.020 0.004 7.527e-08
4: rs17824247 2 144152539 T C 0.5802 -0.016 0.003 2.766e-09
Returning data directly.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7464d3206c8.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7467b5dd450
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval INFO
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval INFO
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
Filtering SNPs based on INFO score.
38 SNPs are below the INFO threshold of 0.9 and will be removed.
Writing in tabular format ==> /tmp/RtmpOxg8AH/info_filter.tsv.gz
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
28 SNPs (50.9%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7464d3206c8.tsv.gz
Summary statistics report:
- 55 rows (59.1% of original 93 rows)
- 55 unique variants
- 41 genome-wide significant variants (P<5e-8)
- 16 chromosomes
Done munging in 0.052 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
2: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
3: rs1008078 1 91189731 T C 0.37310 -0.016 0.003 6.005e-10
4: rs61787263 1 98618714 T C 0.76120 0.016 0.003 5.391e-08
INFO
<num>
1: 1.863269
2: 1.169733
3: 1.401423
4: 1.873332
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7466ef10bc7.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7467e6368d1
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466ef10bc7.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746718b08c3.tsv.gz
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746718b08c3.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.047 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Sorting coordinates with 'data.table'.
Performing data liftover from hg19 to hg38.
Converting summary statistics to GenomicRanges.
Downloading chain file...
Downloading chain file from Ensembl.
trying URL 'ftp://ftp.ensembl.org/pub/assembly_mapping/homo_sapiens/GRCh37_to_GRCh38.chain.gz'
Content type 'unknown' length 285250 bytes (278 KB)
==================================================
/tmp/RtmpOxg8AH/GRCh37_to_GRCh38.chain.gz
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Performing data liftover from hg19 to hg38.
Converting summary statistics to GenomicRanges.
Downloading chain file...
Using existing chain file from ensembl.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7464fb5a7d9.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74636436e0b
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Checking A1 is uppercase
Checking A2 is uppercase
Effect/frq column(s) relate to A2 in the inputted sumstats
Found direction from matching reference genome - NOTE this assumes non-effect allele will match the reference genome
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Performing data liftover from hg19 to hg38.
Converting summary statistics to GenomicRanges.
Downloading chain file...
Using existing chain file from ensembl.
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7464fb5a7d9.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.615 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8430543 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43516856 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72267927 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72296486 T C 0.23690 -0.017 0.003 1.797e-08
IMPUTATION_gen_build
<lgcl>
1: TRUE
2: TRUE
3: TRUE
4: TRUE
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463400d5a2.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Importing tabular file: /tmp/RtmpOxg8AH/file16e7464fb5a7d9.tsv.gz
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA SE P IMPUTATION_gen_build
Allele columns are ambiguous, attempting to infer direction
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA SE P IMPUTATION_gen_build
Loading SNPlocs data for build 144 on GRCH38.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 42 seconds.
Checking A1 is uppercase
Checking A2 is uppercase
Effect/frq column(s) relate to A2 in the inputted sumstats
Found direction from matching reference genome - NOTE this assumes non-effect allele will match the reference genome
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA SE P IMPUTATION_gen_build
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data for build 144 on GRCH38.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 30 seconds.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Performing data liftover from hg38 to hg19.
Converting summary statistics to GenomicRanges.
Downloading chain file...
Using existing chain file from ensembl.
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7463400d5a2.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 1.368 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
IMPUTATION_GEN_BUILD IMPUTATION_gen_build
<lgcl> <lgcl>
1: TRUE TRUE
2: TRUE TRUE
3: TRUE TRUE
4: TRUE TRUE
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7466070f812.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74636436e0b
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 17 seconds.
Checking A1 is uppercase
Checking A2 is uppercase
Effect/frq column(s) relate to A2 in the inputted sumstats
Found direction from matching reference genome - NOTE this assumes non-effect allele will match the reference genome
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 16 seconds.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466070f812.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.673 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7464acfd6a0.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74636436e0b
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 15 seconds.
Checking A1 is uppercase
Checking A2 is uppercase
Effect/frq column(s) relate to A2 in the inputted sumstats
Found direction from matching reference genome - NOTE this assumes non-effect allele will match the reference genome
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Performing data liftover from hg19 to hg38.
Converting summary statistics to GenomicRanges.
Using local chain file...
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7464acfd6a0.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.628 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8430543 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43516856 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72267927 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72296486 T C 0.23690 -0.017 0.003 1.797e-08
IMPUTATION_gen_build
<lgcl>
1: TRUE
2: TRUE
3: TRUE
4: TRUE
Returning path to saved data.
[1] "/tmp/RtmpOxg8AH/data/file1/file16e74621661c1c.tsv.gz"
[1] "/tmp/RtmpOxg8AH/data/file2/file16e7464e88e403.tsv.gz"
[1] "/tmp/RtmpOxg8AH/data/file3/file16e7469a44e21.tsv.gz"
[1] "/tmp/RtmpOxg8AH/data/file4/file16e7466e58b954.tsv.gz"
[1] "/tmp/RtmpOxg8AH/data/file5/file16e7464d15e4b4.tsv.gz"
[1] "/tmp/RtmpOxg8AH/data/file6/file16e746141e377b.tsv.gz"
[1] "/tmp/RtmpOxg8AH/data/file7/file16e74644230205.tsv.gz"
[1] "/tmp/RtmpOxg8AH/data/file8/file16e74627a24899.tsv.gz"
[1] "/tmp/RtmpOxg8AH/data/file9/file16e74654db8edf.tsv.gz"
[1] "/tmp/RtmpOxg8AH/data/file10/file16e746ef2d8a5.tsv.gz"
10 file(s) found.
Parsing info from 10 log file(s).
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74641d71430.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7467564958e
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 92 unique variants
- 69 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
WARNING: 1 rows in sumstats file are missing data and will be removed.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
46 SNPs (50%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74641d71430.tsv.gz
Summary statistics report:
- 92 rows (98.9% of original 93 rows)
- 92 unique variants
- 69 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs10061788 5 87934707 A G 0.2164 0.021 0.004 2.464e-09
2: rs1007883 16 51163406 T C 0.3713 -0.015 0.003 5.326e-08
3: rs1008078 1 91189731 T C 0.3731 -0.016 0.003 6.005e-10
4: rs1043209 14 23373986 A G 0.6026 0.018 0.003 1.816e-11
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74648295d20.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7467564958e
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74648295d20.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.067 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs10061788 5 87934707 A G 0.2164 0.021 0.004 2.464e-09
2: rs1007883 16 51163406 T C 0.3713 -0.015 0.003 5.326e-08
3: rs1008078 1 91189731 T C 0.3731 -0.016 0.003 6.005e-10
4: rs1043209 14 23373986 A G 0.6026 0.018 0.003 1.816e-11
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7466c3f337b.tsv.gz
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 21 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 1 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 2 seconds.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466c3f337b.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.093 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs10061788 5 87934707 A G 0.2164 0.021 0.004 2.464e-09
2: rs1007883 16 51163406 T C 0.3713 -0.015 0.003 5.326e-08
3: rs1008078 1 91189731 T C 0.3731 -0.016 0.003 6.005e-10
4: rs1043209 14 23373986 A G 0.6026 0.018 0.003 1.816e-11
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74669226ad9.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74652b23c0b
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
1 SNPs found with multiple RSIDs on one row, the first will be taken. If you would rather remove these SNPs set
`remove_multi_rs_snp=TRUE`.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74669226ad9.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
convert_multi_rs_SNP
<lgcl>
1: NA
2: NA
3: NA
4: NA
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746706f97c5.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74652b23c0b
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746706f97c5.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.047 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746419d509e.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7461653fcd9
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 92 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Writing in tabular format ==> /tmp/RtmpOxg8AH/snp_multi_rs_one_row.tsv.gz
1 SNPs found with multiple RSIDs on one row, these will be removed. If you would rather take the first RS ID set
`remove_multi_rs_snp`=FALSE
Checking SNP RSIDs.
1 SNP IDs are not correctly formatted. These will be corrected from the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Writing in tabular format ==> /tmp/RtmpOxg8AH/snp_not_found_from_chr_bp.tsv.gz
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Coercing BP column to numeric.
Checking for missing data.
WARNING: 1 rows in sumstats file are missing data and will be removed.
Writing in tabular format ==> /tmp/RtmpOxg8AH/missing_data.tsv.gz
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
1 SNPs have SE values <= 0 and will be removed
Writing in tabular format ==> /tmp/RtmpOxg8AH/se_neg.tsv.gz
Ensuring all SNPs have N<5 std dev above mean.
Checking for strand ambiguous SNPs.
8 SNPs are strand-ambiguous alleles including 4 A/T and 4 C/G ambiguous SNPs. These will be removed
Writing in tabular format ==> /tmp/RtmpOxg8AH/snp_strand_ambiguous.tsv.gz
41 SNPs (50.6%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746419d509e.tsv.gz
Summary statistics report:
- 81 rows (87.1% of original 93 rows)
- 80 unique variants
- 59 genome-wide significant variants (P<5e-8)
- 19 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
3: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
4: rs1008078 1 91189731 T C 0.37310 -0.016 0.003 6.005e-10
IMPUTATION_SNP
<lgcl>
1: NA
2: NA
3: NA
4: NA
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7468a5446e.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74632fcf2bb
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
chromosome rs_id markername position_hg18 Effect_allele Other_allele EAF_HapMapCEU N_SMK Effect_SMK StdErr_SMK P_value_SMK N_NONSMK Effect_NonSMK StdErr_NonSMK P_value_NonSMK
Standardising column headers.
First line of summary statistics file:
chromosome rs_id markername position_hg18 Effect_allele Other_allele EAF_HapMapCEU N_SMK Effect_SMK StdErr_SMK P_value_SMK N_NONSMK Effect_NonSMK StdErr_NonSMK P_value_NonSMK
Summary statistics report:
- 5 rows
- 5 unique variants
- 1 chromosomes
Checking for multi-GWAS.
WARNING: Multiple traits found in sumstats file only one of which can be analysed:
SMK, NONSMK
Standardising column headers.
First line of summary statistics file:
CHR SNP MARKERNAME POSITION_HG18 A2 A1 EAF_HAPMAPCEU N EFFECT STDERR P_VALUE N_NONSMK EFFECT_NONSMK STDERR_NONSMK P_VALUE_NONSMK
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
1 SNP IDs are not correctly formatted and will be removed.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Summary statistics file does not have obvious CHR/BP columns. Checking to see if they are joined in another column.
Column MARKERNAME has been separated into the columns CHR, BP
Standardising column headers.
First line of summary statistics file:
CHR SNP POSITION_HG18 A2 A1 EAF_HAPMAPCEU N BETA SE P N_NONSMK EFFECT_NONSMK STDERR_NONSMK P_VALUE_NONSMK BP
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Ensuring that the N column is all integers.
The sumstats N column is not all integers, this could effect downstream analysis. These will be converted to integers.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7468a5446e.tsv.gz
Summary statistics report:
- 4 rows (80% of original 5 rows)
- 4 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.139 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 POSITION_HG18 EAF_HAPMAPCEU N
<char> <char> <int> <char> <char> <int> <num> <int>
1: rs1000050 chr1 161003087 C T 161003087 0.9000 36257
2: rs1000073 chr1 155522020 G A 155522020 0.3136 36335
3: rs1000075 chr1 94939420 C T 94939420 0.3583 38959
4: rs1000085 chr1 66630503 G C 66630503 0.1667 38761
BETA SE P N_NONSMK EFFECT_NONSMK STDERR_NONSMK P_VALUE_NONSMK
<num> <num> <num> <int> <num> <num> <num>
1: 0.0001 0.0109 0.9931 127514 0.0058 0.0059 0.3307
2: 0.0046 0.0083 0.5812 126780 0.0038 0.0045 0.3979
3: -0.0013 0.0082 0.8687 147567 -0.0043 0.0044 0.3259
4: 0.0053 0.0095 0.5746 147259 -0.0034 0.0052 0.5157
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7464238f14c.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74635339380
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N N_fixed
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N N_fixed
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Ensuring that the N column is all integers.
The sumstats N column is not all integers, this could effect downstream analysis. These will be converted to integers.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7464238f14c.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P N
<char> <int> <int> <char> <char> <num> <num> <num> <num> <int>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08 5
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10 1
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14 1
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08 7
N_FIXED
<int>
1: 5
2: 1
3: 1
4: 7
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7461bba273f.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746449d4eca
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
The sumstats N column is not all integers, this could effect downstream analysis.These will NOT be converted to integers.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
1 SNPs have N values 5 standard deviations above the mean and will be removed
Writing in tabular format ==> /tmp/RtmpOxg8AH/n_large.tsv.gz
47 SNPs (51.1%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7461bba273f.tsv.gz
Summary statistics report:
- 92 rows (98.9% of original 93 rows)
- 92 unique variants
- 69 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P N
<char> <int> <int> <char> <char> <num> <num> <num> <num> <int>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08 3
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10 5
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14 3
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08 3
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746cea72b.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746449d4eca
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
The sumstats N column is not all integers, this could effect downstream analysis.These will NOT be converted to integers.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
1 SNPs have N values 5 standard deviations above the mean and will be removed
Writing in tabular format ==> /tmp/RtmpOxg8AH/n_large.tsv.gz
47 SNPs (51.1%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746cea72b.tsv.gz
Summary statistics report:
- 92 rows (98.9% of original 93 rows)
- 92 unique variants
- 69 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P N
<char> <int> <int> <char> <char> <num> <num> <num> <num> <int>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08 3
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10 5
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14 3
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08 3
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7466c8b17d1.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746449d4eca
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval N
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
The sumstats N column is not all integers, this could effect downstream analysis.These will NOT be converted to integers.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
1 SNPs have N values 5 standard deviations above the mean and will be removed
Writing in tabular format ==> /tmp/RtmpOxg8AH/n_large.tsv.gz
Removing rows where is.na(N)
0 SNPs have N values that are NA and will be removed.
Writing in tabular format ==> /tmp/RtmpOxg8AH/n_null.tsv.gz
47 SNPs (51.1%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466c8b17d1.tsv.gz
Summary statistics report:
- 92 rows (98.9% of original 93 rows)
- 92 unique variants
- 69 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.047 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P N
<char> <int> <int> <char> <char> <num> <num> <num> <num> <int>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08 3
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10 5
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14 3
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08 3
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7466e95004f.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7461a5444f5
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS EAF Beta SE Pval
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking for incorrect base-pair positions
WARNING: No A2 column found in the data, multi-allelic can't not be accurately chosen (as any
of the choices could be valid). bi_allelic_filter has been forced to TRUE.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 15 seconds.
Deriving both A1 and A2 from reference genome
WARNING: Inferring the alternative allele (A2) from the reference genome. In some instances, there are more than one
alternative allele. Arbitrarily, only the first will be kept. See column `alt_alleles` in your returned sumstats file
for all alternative alleles.
Writing in tabular format ==> /tmp/RtmpOxg8AH/alleles_not_found_from_snp.tsv.gz
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Checking for SNPs with duplicated base-pair positions.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466e95004f.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.314 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 G A 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 G A 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
alt_alleles IMPUTATION_A1 IMPUTATION_A2
<char> <lgcl> <lgcl>
1: C TRUE TRUE
2: A TRUE TRUE
3: A TRUE TRUE
4: C TRUE TRUE
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74640b745e7.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7461a5444f5
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A2 EAF Beta SE Pval
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A2 is uppercase
Checking for incorrect base-pair positions
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
One of A1/A2 are missing, allele flipping will be tested
Deriving A1 from reference genome
Writing in tabular format ==> /tmp/RtmpOxg8AH/alleles_not_found_from_snp.tsv.gz
Checking for correct direction of A1 (reference) and A2 (alternative allele).
There are 46 SNPs where A1 doesn't match the reference genome.
These will be flipped with their effect columns.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74640b745e7.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.309 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 G G 0.36940 -0.017 0.003 2.359e-10
3: rs34305371 1 72733610 G G 0.08769 -0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
IMPUTATION_A1 flipped
<lgcl> <lgcl>
1: TRUE NA
2: TRUE TRUE
3: TRUE TRUE
4: TRUE NA
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746763cfde8.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7461a5444f5
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 EAF Beta SE Pval
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking for incorrect base-pair positions
WARNING: No A2 column found in the data, multi-allelic can't not be accurately chosen (as any
of the choices could be valid). bi_allelic_filter has been forced to TRUE.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
One of A1/A2 are missing, allele flipping will be tested
Deriving A2 from reference genome
WARNING: Inferring the alternative allele (A2) from the reference genome. In some instances, there are more than one
alternative allele. Arbitrarily, only the first will be kept. See column `alt_alleles` in your returned sumstats file
for all alternative alleles.
Writing in tabular format ==> /tmp/RtmpOxg8AH/alleles_not_found_from_snp.tsv.gz
Checking for correct direction of A1 (reference) and A2 (alternative allele).
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Checking for SNPs with duplicated base-pair positions.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746763cfde8.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.305 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A A 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A A 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
alt_alleles IMPUTATION_A2
<char> <lgcl>
1: C TRUE
2: A TRUE
3: A TRUE
4: C TRUE
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74660818804.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7461a5444f5
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for correct direction of A1 (reference) and A2 (alternative allele).
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
There are 46 SNPs where A1 doesn't match the reference genome.
These will be flipped with their effect columns.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74660818804.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.305 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 G A 0.36940 -0.017 0.003 2.359e-10
3: rs34305371 1 72733610 G A 0.08769 -0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7467ac6206d.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74614bd252a
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Summary statistics file does not have obvious CHR/BP columns. Checking to see if they are joined in another column.
Standardising column headers.
First line of summary statistics file:
SNP BP A1 A2 FRQ BETA SE P
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7467ac6206d.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.359 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463a06bd29.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7467b3acfdb
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Summary statistics file does not have obvious CHR/BP columns. Checking to see if they are joined in another column.
Standardising column headers.
First line of summary statistics file:
SNP A1 A2 FRQ BETA SE P
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Writing in tabular format ==> /tmp/RtmpOxg8AH/chr_bp_not_found_from_snp.tsv.gz
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7463a06bd29.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.351 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7461f6080b3.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74672fa1e88
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
1 SNP IDs are not correctly formatted. These will be corrected from the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Coercing BP column to numeric.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7461f6080b3.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.052 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746f9554b4.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74672fa1e88
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746f9554b4.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74671b8bf1c.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7463c0fe617
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
1 SNP IDs appear to be made up of chr:bp, these will be replaced by their SNP ID from the reference genome
Loading SNPlocs data for build 144 on GRCH37.
1 SNP IDs are not correctly formatted and will be removed.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Summary statistics file does not have obvious CHR/BP columns. Checking to see if they are joined in another column.
Standardising column headers.
First line of summary statistics file:
SNP A1 A2 FRQ BETA SE P
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 92 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
46 SNPs (50%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74671b8bf1c.tsv.gz
Summary statistics report:
- 92 rows (98.9% of original 93 rows)
- 92 unique variants
- 69 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.346 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463e2e46e1.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746550eba54
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
1 SNP IDs are not correctly formatted. These will be corrected from the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
1 SNP IDs appear to be made up of chr:bp, these will be replaced by their SNP ID from the reference genome
Loading SNPlocs data for build 144 on GRCH37.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Coercing BP column to numeric.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7463e2e46e1.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.055 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7464bce0060.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74656a0001a
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7461b5a25fa.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7463c0fe617
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7461b5a25fa.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
          SNP   CHR       BP     A1     A2     FRQ   BETA    SE         P
       <char> <int>    <int> <char> <char>   <num>  <num> <num>     <num>
1:   rs301800     1  8490603      T      C 0.17910  0.019 0.003 1.794e-08
2: rs11210860     1 43982527      A      G 0.36940  0.017 0.003 2.359e-10
3: rs34305371     1 72733610      A      G 0.08769  0.035 0.005 3.762e-14
4:  rs2568955     1 72762169      T      C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74655d73dd7.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7463ec8cd4d
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Loading SNPlocs data for build 144 on GRCH37.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74655d73dd7.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.099 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
          SNP   CHR       BP     A1     A2     FRQ   BETA    SE         P
       <char> <int>    <int> <char> <char>   <num>  <num> <num>     <num>
1:   rs301800     1  8490603      T      C 0.17910  0.019 0.003 1.794e-08
2: rs11210860     1 43982527      A      G 0.36940  0.017 0.003 2.359e-10
3: rs34305371     1 72733610      A      G 0.08769  0.035 0.005 3.762e-14
4:  rs2568955     1 72762169      T      C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746770f0575.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7461b98ae56
Checking for empty columns.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Ensuring all SNPs are on the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 15 seconds.
1 SNPs are not on the reference genome. These will be corrected from the reference genome.
Loading SNPlocs data for build 144 on GRCH37.
Writing in tabular format ==> /tmp/RtmpOxg8AH/snp_not_found_from_chr_bp.tsv.gz
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 93 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 15 seconds.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746770f0575.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.584 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
          SNP   CHR       BP     A1     A2     FRQ   BETA    SE         P
       <char> <int>    <int> <char> <char>   <num>  <num> <num>     <num>
1:   rs301800     1  8490603      T      C 0.17910  0.019 0.003 1.794e-08
2: rs11210860     1 43982527      A      G 0.36940  0.017 0.003 2.359e-10
3: rs34305371     1 72733610      A      G 0.08769  0.035 0.005 3.762e-14
4:  rs2568955     1 72762169      T      C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746430db085.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7461b98ae56
Checking for empty columns.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746430db085.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.049 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
          SNP   CHR       BP     A1     A2     FRQ   BETA    SE         P
       <char> <int>    <int> <char> <char>   <num>  <num> <num>     <num>
1:   rs301800     1  8490603      T      C 0.17910  0.019 0.003 1.794e-08
2: rs11210860     1 43982527      A      G 0.36940  0.017 0.003 2.359e-10
3: rs34305371     1 72733610      A      G 0.08769  0.035 0.005 3.762e-14
4:  rs2568955     1 72762169      T      C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
Inferring genome build of 1 sumstats file(s).
Inferring genome build.
Reading in only the first 19 rows of sumstats.
Importing tabular file: /home/biocbuild/bbs-3.22-bioc-longtests/meat/MungeSumstats.Rcheck/MungeSumstats/extdata/eduAttainOkbay.txt
Checking for empty columns.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 10 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 14 seconds.
Loading SNPlocs data for build 144 on GRCH38.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 10 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 18 seconds.
Inferred genome build: GRCH37
Time difference of 37.27024 secs
GRCH37: 1 file(s)
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74655d45352.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746737e03af
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 23 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
1 SNPs have been removed as their BP column is not in the range of 1 to the length of the chromosome
Writing in tabular format ==> /tmp/RtmpOxg8AH/bad_bp.tsv.gz
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
2 SNPs are on chromosomes X, Y, MT and will be removed.
Writing in tabular format ==> /tmp/RtmpOxg8AH/chr_excl.tsv.gz
45 SNPs (50%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74655d45352.tsv.gz
Summary statistics report:
- 90 rows (96.8% of original 93 rows)
- 90 unique variants
- 67 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.053 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
          SNP   CHR       BP     A1     A2     FRQ   BETA    SE         P
       <char> <int>    <int> <char> <char>   <num>  <num> <num>     <num>
1:   rs301800     1  8490603      T      C 0.17910  0.019 0.003 1.794e-08
2: rs11210860     1 43982527      A      G 0.36940  0.017 0.003 2.359e-10
3: rs34305371     1 72733610      A      G 0.08769  0.035 0.005 3.762e-14
4:  rs2568955     1 72762169      T      C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7466d735049.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746737e03af
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466d735049.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
          SNP   CHR       BP     A1     A2     FRQ   BETA    SE         P
       <char> <int>    <int> <char> <char>   <num>  <num> <num>     <num>
1:   rs301800     1  8490603      T      C 0.17910  0.019 0.003 1.794e-08
2: rs11210860     1 43982527      A      G 0.36940  0.017 0.003 2.359e-10
3: rs34305371     1 72733610      A      G 0.08769  0.035 0.005 3.762e-14
4:  rs2568955     1 72762169      T      C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
Reading header.
Reading entire file.
Reading header.
Reading header.
Reading header.
Reading header.
Reading header.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74618e50a3b
Checking for empty columns.
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA SE P
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466fe91ec1
Checking for empty columns.
Standardising column headers.
First line of summary statistics file:
SNP CHR BP A1 A2 FRQ BETA SE P
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74656ed6efa.vcf.bgz
Sorting coordinates with 'data.table'.
Converting summary statistics to GenomicRanges.
Converting summary statistics to VRanges.
Writing in VCF format ==> /tmp/RtmpOxg8AH/file16e74656ed6efa.vcf.bgz
Using local VCF.
Finding empty VCF columns based on first 10,000 rows.
1 sample detected: GWAS
Constructing ScanVcfParam object.
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Checking for empty columns.
Time difference of 0.1 secs
VCF data.table contains: 93 rows x 11 columns.
Time difference of 0.3 secs
No INFO (SI) column detected.
Standardising column headers.
First line of summary statistics file:
ID chr BP end REF ALT SNP FRQ BETA SE P
Using local VCF.
bgzip-compressing VCF file.
Finding empty VCF columns based on first 10,000 rows.
Dropping 1 duplicate column(s).
1 sample detected: EBI-a-GCST005647
Constructing ScanVcfParam object.
VCF contains: 39,630,630 variant(s) x 1 sample(s)
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Dropping 1 duplicate column(s).
Checking for empty columns.
Unlisting 3 columns.
Dropped 314 duplicate rows.
Time difference of 0.1 secs
VCF data.table contains: 101 rows x 11 columns.
Time difference of 0.4 secs
Renaming ID as SNP.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463fa222e2.vcf.bgz
Sorting coordinates with 'data.table'.
Converting summary statistics to GenomicRanges.
Converting summary statistics to VRanges.
Writing in VCF format ==> /tmp/RtmpOxg8AH/file16e7463fa222e2.vcf.bgz
Using local VCF.
Finding empty VCF columns based on first 10,000 rows.
1 sample detected: GWAS
Constructing ScanVcfParam object.
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Checking for empty columns.
Time difference of 0.1 secs
VCF data.table contains: 101 rows x 13 columns.
Time difference of 0.3 secs
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
Standardising column headers.
First line of summary statistics file:
ID chr BP end REF SNP END FILTER FRQ BETA LP SE P
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7466e044d42.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Infer Effect Column
First line of summary statistics file:
SNP P FRQ BETA CHR BP
Standardising column headers.
First line of summary statistics file:
SNP P FRQ BETA CHR BP
Summary statistics report:
- 5 rows
- 5 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
5 SNP IDs contain other information in the same column. These will be separated.
Checking for merged allele column.
Column SNP_INFO has been separated into the columns A1, A2
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Coercing BP column to numeric.
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
SE is not present but can be imputed with BETA & P. Set impute_se=TRUE and rerun to do this.
Ensuring all SNPs have N<5 std dev above mean.
3 SNPs (60%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466e044d42.tsv.gz
Summary statistics report:
- 5 rows (100% of original 5 rows)
- 5 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
           SNP   CHR    BP     A1     A2           P       FRQ      BETA
        <char> <int> <int> <char> <char>       <num>     <num>     <num>
1: rs140052487     1 54353      C      A 0.037219838 0.3000548 0.8797957
2: rs558796213     1 54564      G      T 0.004382482 0.5848666 0.7068747
3: rs561234294     1 54591      A      G 0.070968402 0.3334671 0.7319726
4:   rs2462492     1 54676      C      T 0.065769040 0.6220120 0.9316344
Returning data directly.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
******::NOTE::******
- Log results will be saved to `tempdir()` by default.
- This means all log data from the run will be deleted upon ending the R session.
- To keep it, change `log_folder` to an actual directory (e.g. log_folder='./').
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74658f3c73d.tsv.gz
Log data to be saved to ==> /tmp/RtmpOxg8AH
Infer Effect Column
First line of summary statistics file:
SNP P FRQ BETA CHR BP A1 A2
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
SNP P FRQ BETA CHR BP A1 A2
Summary statistics report:
- 5 rows
- 5 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Coercing BP column to numeric.
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
SE is not present but can be imputed with BETA & P. Set impute_se=TRUE and rerun to do this.
Ensuring all SNPs have N<5 std dev above mean.
3 SNPs (60%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74658f3c73d.tsv.gz
Summary statistics report:
- 5 rows (100% of original 5 rows)
- 5 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 P FRQ BETA
<char> <int> <int> <char> <char> <num> <num> <num>
1: rs140052487 1 54353 C A 0.037219838 0.3000548 0.8797957
2: rs558796213 1 54564 G T 0.004382482 0.5848666 0.7068747
3: rs561234294 1 54591 A G 0.070968402 0.3334671 0.7319726
4: rs2462492 1 54676 C T 0.065769040 0.6220120 0.9316344
Returning data directly.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7462be91af.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74694b28fc
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746354e8c2f.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74629ac2441
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746354e8c2f.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74674f0af12.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74629ac2441
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74674f0af12.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746333dece4.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746559a264
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746333dece4.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74672ae011a.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74651c1365b
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74672ae011a.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7467a317795.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e746787a135c
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
5 SNPs have SE values <= 0 and will be removed
Ensuring all SNPs have N<5 std dev above mean.
44 SNPs (50%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7467a317795.tsv.gz
Summary statistics report:
- 88 rows (94.6% of original 93 rows)
- 88 unique variants
- 65 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval Support
Returning unmapped column names without making them uppercase.
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval Support
Returning unmapped column names without making them uppercase.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74616fb7223.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466d17ea06
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 85 rows
- 85 unique variants
- 63 genome-wide significant variants (P<5e-8)
- 19 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for strand ambiguous SNPs.
43 SNPs (50.6%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74616fb7223.tsv.gz
Summary statistics report:
- 85 rows (100% of original 85 rows)
- 85 unique variants
- 63 genome-wide significant variants (P<5e-8)
- 19 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7467f660db9.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7466d17ea06
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for strand ambiguous SNPs.
8 SNPs are strand-ambiguous alleles including 4 A/T and 4 C/G ambiguous SNPs. These will be removed
43 SNPs (50.6%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7467f660db9.tsv.gz
Summary statistics report:
- 85 rows (91.4% of original 93 rows)
- 85 unique variants
- 63 genome-wide significant variants (P<5e-8)
- 19 chromosomes
Done munging in 0.048 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 FRQ BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74678b3027f.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74622125659.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e7463cbe7585
Checking for empty columns.
Non-standard mapping file detected.Making sure all entries in `Uncorrected` are in upper case.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Summary statistics report:
- 93 rows
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74622125659.tsv.gz
Summary statistics report:
- 93 rows (100% of original 93 rows)
- 93 unique variants
- 70 genome-wide significant variants (P<5e-8)
- 20 chromosomes
Done munging in 0.05 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 EAF BETA SE P
<char> <int> <int> <char> <char> <num> <num> <num> <num>
1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
Returning data directly.
Converting summary statistics to GenomicRanges.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463c1b62bd.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746e4c9cd1.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74670ddcac.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74629334cc3.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74625480ef4.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7467cb38af9.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7464bc76b70.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74624ae1cad.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746c078d5f.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74629316129.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463fbcb2e0.tsv.gz
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74629d137f5.tsv.gz
Using local VCF.
bgzip-compressing VCF file.
Finding empty VCF columns based on first 10,000 rows.
Dropping 1 duplicate column(s).
1 sample detected: EBI-a-GCST005647
Constructing ScanVcfParam object.
VCF contains: 39,630,630 variant(s) x 1 sample(s)
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Dropping 1 duplicate column(s).
Checking for empty columns.
Unlisting 3 columns.
Dropped 314 duplicate rows.
Time difference of 0.1 secs
VCF data.table contains: 101 rows x 11 columns.
Time difference of 0.4 secs
Renaming ID as SNP.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
Infer Effect Column
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Summary statistics report:
- 101 rows
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
2 SNPs (2%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74629d137f5.tsv.gz
Summary statistics report:
- 101 rows (100% of original 101 rows)
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.063 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 END FILTER FRQ BETA LP
<char> <int> <int> <char> <char> <int> <char> <num> <num> <num>
1: rs58108140 1 10583 G A 10583 PASS 0.1589 0.0312 0.369267
2: rs806731 1 30923 G T 30923 PASS 0.7843 -0.0114 0.126854
3: rs116400033 1 51479 T A 51479 PASS 0.1829 0.0711 1.262410
4: rs146477069 1 54421 A G 54421 PASS 0.0352 -0.0240 0.112102
SE P
<num> <num>
1: 0.0393 0.42730011
2: 0.0353 0.74669974
3: 0.0370 0.05464998
4: 0.0830 0.77249913
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74641717dea.tsv.gz
Using local VCF.
bgzip-compressing VCF file.
Finding empty VCF columns based on first 10,000 rows.
Dropping 1 duplicate column(s).
1 sample detected: EBI-a-GCST005647
Constructing ScanVcfParam object.
VCF contains: 39,630,630 variant(s) x 1 sample(s)
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Dropping 1 duplicate column(s).
Checking for empty columns.
Unlisting 3 columns.
Dropped 314 duplicate rows.
Time difference of 0.1 secs
VCF data.table contains: 101 rows x 11 columns.
Time difference of 0.4 secs
Renaming ID as SNP.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
Infer Effect Column
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Summary statistics report:
- 101 rows
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for correct direction of A1 (reference) and A2 (alternative allele).
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 101 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 13 seconds.
There are 1 SNPs where A1 doesn't match the reference genome.
These will be flipped with their effect columns.
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicate SNPs from SNP ID.
Found 10 Indels. These won't be checked for duplicates based on RS ID as there can be multiples.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
Checking for SNPs with duplicated base-pair positions.
Found 10 Indels. These won't be checked for duplicates based on base-pair position as there can be multiples.
WARNING If your sumstat doesn't contain Indels, set the indel param to FALSE & rerun MungeSumstats::format_sumstats()
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Checking for bi-allelic SNPs.
2 SNPs (2%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74641717dea.tsv.gz
Summary statistics report:
- 101 rows (100% of original 101 rows)
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.301 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 END FILTER FRQ BETA LP
<char> <int> <int> <char> <char> <int> <char> <num> <num> <num>
1: rs58108140 1 10583 G A 10583 PASS 0.1589 0.0312 0.369267
2: rs806731 1 30923 G T 30923 PASS 0.7843 -0.0114 0.126854
3: rs116400033 1 51479 T A 51479 PASS 0.1829 0.0711 1.262410
4: rs146477069 1 54421 A G 54421 PASS 0.0352 -0.0240 0.112102
SE P
<num> <num>
1: 0.0393 0.42730011
2: 0.0353 0.74669974
3: 0.0370 0.05464998
4: 0.0830 0.77249913
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74611d0423f.tsv.gz
Using local VCF.
bgzip-compressing VCF file.
Finding empty VCF columns based on first 10,000 rows.
Dropping 1 duplicate column(s).
1 sample detected: EBI-a-GCST005647
Constructing ScanVcfParam object.
VCF contains: 39,630,630 variant(s) x 1 sample(s)
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Dropping 1 duplicate column(s).
Checking for empty columns.
Unlisting 3 columns.
Dropped 314 duplicate rows.
Time difference of 0.1 secs
VCF data.table contains: 101 rows x 11 columns.
Time difference of 0.4 secs
Renaming ID as SNP.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
Infer Effect Column
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Summary statistics report:
- 101 rows
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
2 SNPs (2%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74611d0423f.tsv.gz
Summary statistics report:
- 101 rows (100% of original 101 rows)
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.065 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 END FILTER FRQ BETA LP
<char> <int> <int> <char> <char> <int> <char> <num> <num> <num>
1: rs58108140 1 10583 G A 10583 PASS 0.1589 0.0312 0.369267
2: rs806731 1 30923 G T 30923 PASS 0.7843 -0.0114 0.126854
3: rs116400033 1 51479 T A 51479 PASS 0.1829 0.0711 1.262410
4: rs146477069 1 54421 A G 54421 PASS 0.0352 -0.0240 0.112102
SE P
<num> <num>
1: 0.0393 0.42730011
2: 0.0353 0.74669974
3: 0.0370 0.05464998
4: 0.0830 0.77249913
Returning data directly.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746764976f5.tsv.gz
Using local VCF.
bgzip-compressing VCF file.
Finding empty VCF columns based on first 10,000 rows.
Dropping 1 duplicate column(s).
1 sample detected: EBI-a-GCST005647
Constructing ScanVcfParam object.
VCF contains: 39,630,630 variant(s) x 1 sample(s)
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Dropping 1 duplicate column(s).
Checking for empty columns.
Unlisting 3 columns.
Dropped 314 duplicate rows.
Time difference of 0.1 secs
VCF data.table contains: 101 rows x 11 columns.
Time difference of 0.4 secs
Renaming ID as SNP.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
Infer Effect Column
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Summary statistics report:
- 101 rows
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
2 SNPs (2%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e746764976f5.tsv.gz
Summary statistics report:
- 101 rows (100% of original 101 rows)
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.062 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 END FILTER FRQ BETA LP
<char> <int> <int> <char> <char> <int> <char> <num> <num> <num>
1: rs58108140 1 10583 G A 10583 PASS 0.1589 0.0312 0.369267
2: rs806731 1 30923 G T 30923 PASS 0.7843 -0.0114 0.126854
3: rs116400033 1 51479 T A 51479 PASS 0.1829 0.0711 1.262410
4: rs146477069 1 54421 A G 54421 PASS 0.0352 -0.0240 0.112102
SE P
<num> <num>
1: 0.0393 0.42730011
2: 0.0353 0.74669974
3: 0.0370 0.05464998
4: 0.0830 0.77249913
Returning data directly.
Converting summary statistics to GenomicRanges.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74610cdd0d.tsv.gz
Using local VCF.
bgzip-compressing VCF file.
Finding empty VCF columns based on first 10,000 rows.
Dropping 1 duplicate column(s).
1 sample detected: EBI-a-GCST005647
Constructing ScanVcfParam object.
VCF contains: 39,630,630 variant(s) x 1 sample(s)
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Dropping 1 duplicate column(s).
Checking for empty columns.
Unlisting 3 columns.
Dropped 314 duplicate rows.
Time difference of 0.1 secs
VCF data.table contains: 101 rows x 11 columns.
Time difference of 0.4 secs
Renaming ID as SNP.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
Infer Effect Column
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Summary statistics report:
- 101 rows
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
2 SNPs (2%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74610cdd0d.tsv.gz
Summary statistics report:
- 101 rows (100% of original 101 rows)
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.062 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 END FILTER FRQ BETA LP
<char> <int> <int> <char> <char> <int> <char> <num> <num> <num>
1: rs58108140 1 10583 G A 10583 PASS 0.1589 0.0312 0.369267
2: rs806731 1 30923 G T 30923 PASS 0.7843 -0.0114 0.126854
3: rs116400033 1 51479 T A 51479 PASS 0.1829 0.0711 1.262410
4: rs146477069 1 54421 A G 54421 PASS 0.0352 -0.0240 0.112102
SE P
<num> <num>
1: 0.0393 0.42730011
2: 0.0353 0.74669974
3: 0.0370 0.05464998
4: 0.0830 0.77249913
Returning data directly.
Converting summary statistics to GenomicRanges.
Converting summary statistics to VRanges.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7463ee5fd47.tsv.gz
Using local VCF.
bgzip-compressing VCF file.
Finding empty VCF columns based on first 10,000 rows.
Dropping 1 duplicate column(s).
1 sample detected: EBI-a-GCST005647
Constructing ScanVcfParam object.
VCF contains: 39,630,630 variant(s) x 1 sample(s)
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Dropping 1 duplicate column(s).
Checking for empty columns.
Unlisting 3 columns.
Dropped 314 duplicate rows.
Time difference of 0.1 secs
VCF data.table contains: 101 rows x 11 columns.
Time difference of 0.4 secs
Renaming ID as SNP.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
No INFO (SI) column detected.
sumstats has -log10 P-values; these will be converted to unadjusted p-values in the 'P' column.
Infer Effect Column
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Standardising column headers.
First line of summary statistics file:
SNP chr BP end REF ALT FILTER AF ES LP SE P
Ensuring parameters comply with LDSC format.
Setting `compute_z=BETA` to comply with LDSC format.
Summary statistics report:
- 101 rows
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Checking for correct direction of A1 (reference) and A2 (alternative allele).
Loading SNPlocs data for build 144 on GRCH37.
Loading reference genome data.
Preprocessing RSIDs.
Validating RSIDs of 101 SNPs using BSgenome::snpsById...
BSgenome::snpsById done in 13 seconds.
Reordering so first three column headers are SNP, CHR and BP in this order.
Reordering so the fourth and fifth columns are A1 and A2.
Checking for missing data.
Checking for duplicate columns.
Checking for duplicated rows.
INFO column not available. Skipping INFO score filtering step.
Filtering SNPs, ensuring SE>0.
Ensuring all SNPs have N<5 std dev above mean.
Computing Z-score from BETA and SE using formula: `BETA/SE`
Assigning N=1001 for all SNPs.
2 SNPs (2%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
The FRQ column was mapped from one of the following from the inputted summary statistics file:
FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.B, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, ALL_AF
As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
Sorting coordinates with 'data.table'.
Renaming A1,A2 to match LDSC format.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7463ee5fd47.tsv.gz
Summary statistics report:
- 101 rows (100% of original 101 rows)
- 101 unique variants
- 0 genome-wide significant variants (P<5e-8)
- 1 chromosomes
Done munging in 0.293 minutes.
Successfully finished preparing sumstats file, preview:
Reading header.
SNP CHR BP A1 A2 END FILTER FRQ BETA LP
<char> <int> <int> <char> <char> <int> <char> <num> <num> <num>
1: rs58108140 1 10583 A G 10583 PASS 0.1589 0.0312 0.369267
2: rs806731 1 30923 T G 30923 PASS 0.7843 -0.0114 0.126854
3: rs116400033 1 51479 A T 51479 PASS 0.1829 0.0711 1.262410
4: rs146477069 1 54421 G A 54421 PASS 0.0352 -0.0240 0.112102
SE P Z N
<num> <num> <num> <int>
1: 0.0393 0.42730011 0.7938931 1001
2: 0.0353 0.74669974 -0.3229462 1001
3: 0.0370 0.05464998 1.9216216 1001
4: 0.0830 0.77249913 -0.2891566 1001
Returning path to saved data.
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e746e845f4a.tsv.gz
Reading header.
Tabular format detected.
Importing tabular file: /tmp/RtmpOxg8AH/file16e74659484d61
Checking for empty columns.
Infer Effect Column
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE
Allele columns are ambiguous, attempting to infer direction
Can't infer allele columns from sumstats
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE
Summary statistics report:
- 93 rows
- 93 unique variants
- 20 chromosomes
Checking for multi-GWAS.
Checking for multiple RSIDs on one row.
Checking SNP RSIDs.
Checking for merged allele column.
Checking A1 is uppercase
Checking A2 is uppercase
Checking for incorrect base-pair positions
Standardising column headers.
First line of summary statistics file:
MarkerName CHR POS A1 A2 EAF Beta SE Pval
Sorting coordinates with 'data.table'.
.tsv
=== write tests ===
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74673653c35.tsv
=== read tests ===
Importing tabular file: /tmp/RtmpOxg8AH/file16e74673653c35.tsv
Checking for empty columns.
.tsv.gz
=== write tests ===
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7465b369f71.tsv.gz
=== read tests ===
Importing tabular file: /tmp/RtmpOxg8AH/file16e7465b369f71.tsv.gz
Checking for empty columns.
.tsv.bgz
=== write tests ===
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7466b912ac9.tsv.bgz
=== read tests ===
Importing tabular bgz file: /tmp/RtmpOxg8AH/file16e7466b912ac9.tsv.bgz
Checking for empty columns.
.tsv.gz
=== write tests ===
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74630c7e91e.tsv
Writing uncompressed instead of gzipped to enable tabix indexing.
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
=== read tests ===
Importing tabular bgz file: /tmp/RtmpOxg8AH/file16e74630c7e91e.tsv.bgz
Checking for empty columns.
.tsv.bgz
=== write tests ===
Sorting coordinates with 'data.table'.
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74652299fe7.tsv
Writing uncompressed instead of gzipped to enable tabix indexing.
Converting full summary stats file to tabix format for fast querying...
Reading header.
Ensuring file is bgzipped.
Tabix-indexing file.
Removing temporary .tsv file.
=== read tests ===
Importing tabular bgz file: /tmp/RtmpOxg8AH/file16e74652299fe7.tsv.bgz
Checking for empty columns.
.csv
=== write tests ===
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7469649a80.csv
=== read tests ===
Importing tabular file: /tmp/RtmpOxg8AH/file16e7469649a80.csv
Checking for empty columns.
.csv.gz
=== write tests ===
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74672a19ef5.csv.gz
=== read tests ===
Importing tabular file: /tmp/RtmpOxg8AH/file16e74672a19ef5.csv.gz
Checking for empty columns.
.vcf
=== write tests ===
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
save_path suggests VCF output but write_vcf=FALSE. Switching output to tabular format (.tsv.gz).
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e7461ee30a26.tsv.gz
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e7461ee30a26.tsv.gz
=== read tests ===
Importing tabular file: /tmp/RtmpOxg8AH/file16e7461ee30a26.tsv.gz
Checking for empty columns.
.vcf.gz
=== write tests ===
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
save_path suggests VCF output but write_vcf=FALSE. Switching output to tabular format (.tsv.gz).
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74653115a56.tsv.gz
Writing in tabular format ==> /tmp/RtmpOxg8AH/file16e74653115a56.tsv.gz
=== read tests ===
Importing tabular file: /tmp/RtmpOxg8AH/file16e74653115a56.tsv.gz
Checking for empty columns.
.vcf
=== write tests ===
Sorting coordinates with 'data.table'.
Converting summary statistics to GenomicRanges.
Converting summary statistics to VRanges.
Writing in VCF format ==> /tmp/RtmpOxg8AH/file16e7464ae69bc4.vcf
=== read tests ===
Using local VCF.
bgzip-compressing VCF file.
Finding empty VCF columns based on first 10,000 rows.
1 sample detected: GWAS
Constructing ScanVcfParam object.
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Checking for empty columns.
Time difference of 0.1 secs
VCF data.table contains: 93 rows x 11 columns.
Time difference of 0.3 secs
No INFO (SI) column detected.
.vcf.gz
=== write tests ===
Sorting coordinates with 'data.table'.
Converting summary statistics to GenomicRanges.
Converting summary statistics to VRanges.
Writing in VCF format ==> /tmp/RtmpOxg8AH/file16e74632a3d0ba.vcf.gz
=== read tests ===
Using local VCF.
Finding empty VCF columns based on first 10,000 rows.
1 sample detected: GWAS
Constructing ScanVcfParam object.
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Checking for empty columns.
Time difference of 0.1 secs
VCF data.table contains: 93 rows x 11 columns.
Time difference of 0.3 secs
No INFO (SI) column detected.
.vcf
=== write tests ===
Sorting coordinates with 'data.table'.
Converting summary statistics to GenomicRanges.
Converting summary statistics to VRanges.
Writing in VCF format ==> /tmp/RtmpOxg8AH/file16e746487a17d9.vcf
.vcf
=== write tests ===
******::NOTE::******
- Formatted results will be saved to `tempdir()` by default.
- This means all formatted summary stats will be deleted upon ending the R session.
- To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
********************
Formatted summary statistics will be saved to ==> /tmp/RtmpOxg8AH/file16e74657715387.vcf.bgz
Sorting coordinates with 'data.table'.
Converting summary statistics to GenomicRanges.
Converting summary statistics to VRanges.
Writing in VCF format ==> /tmp/RtmpOxg8AH/file16e74657715387.vcf.bgz
=== read tests ===
Using local VCF.
File already tabix-indexed.
Finding empty VCF columns based on first 10,000 rows.
1 sample detected: GWAS
Constructing ScanVcfParam object.
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Checking for empty columns.
Time difference of 0.1 secs
VCF data.table contains: 93 rows x 11 columns.
Time difference of 0.3 secs
No INFO (SI) column detected.
.vcf.bgz
=== write tests ===
Sorting coordinates with 'data.table'.
Converting summary statistics to GenomicRanges.
Converting summary statistics to VRanges.
Writing in VCF format ==> /tmp/RtmpOxg8AH/file16e746305ebe07.vcf.bgz
=== read tests ===
Using local VCF.
File already tabix-indexed.
Finding empty VCF columns based on first 10,000 rows.
1 sample detected: GWAS
Constructing ScanVcfParam object.
Reading VCF file: single-threaded
Converting VCF to data.table.
Expanding VCF first, so number of rows may increase.
Checking for empty columns.
Time difference of 0.1 secs
VCF data.table contains: 93 rows x 11 columns.
Time difference of 0.3 secs
No INFO (SI) column detected.
[ FAIL 0 | WARN 5 | SKIP 0 | PASS 196 ]
Warning message:
In .Internal(gc(verbose, reset, full)) :
closing unused connection 4 (/tmp/RtmpOxg8AH/file16e7467dd48ccd_log_msg.txt)
>
> proc.time()
user system elapsed
880.268 88.121 984.431
##############################################################################
##############################################################################
###
### Running command:
###
### /home/biocbuild/bbs-3.22-bioc/R/bin/R CMD check --test-dir=longtests --no-stop-on-test-error --no-codoc --no-examples --no-manual --ignore-vignettes --check-subdirs=no MungeSumstats_1.17.5.tar.gz
###
##############################################################################
##############################################################################
* using log directory ‘/home/biocbuild/bbs-3.22-bioc-longtests/meat/MungeSumstats.Rcheck’
* using R version 4.5.1 Patched (2025-08-23 r88802)
* using platform: x86_64-pc-linux-gnu
* R was compiled by
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
GNU Fortran (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
* running under: Ubuntu 24.04.3 LTS
* using session charset: UTF-8
* using options ‘--no-codoc --no-examples --no-manual --ignore-vignettes --no-stop-on-test-error’
* checking for file ‘MungeSumstats/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘MungeSumstats’ version ‘1.17.5’
* package encoding: UTF-8
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for hidden files and directories ... NOTE
Found the following hidden files and directories:
.BBSoptions
These were most likely included in error. See section ‘Package
structure’ in the ‘Writing R Extensions’ manual.
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘MungeSumstats’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking code files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking whether startup messages can be suppressed ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... NOTE
checkRd: (-1) check_no_chr_bp.Rd:64-65: Lost braces
64 | \item \code{sumstats_dt}{
| ^
checkRd: (-1) check_no_chr_bp.Rd:66-67: Lost braces
66 | \item \code{rsids}{
| ^
checkRd: (-1) check_no_chr_bp.Rd:68-69: Lost braces
68 | \item \code{log_files}{
| ^
checkRd: (-1) check_on_ref_genome.Rd:73-74: Lost braces
73 | \item \code{sumstats_dt}{
| ^
checkRd: (-1) check_on_ref_genome.Rd:75-76: Lost braces
75 | \item \code{rsids}{
| ^
checkRd: (-1) check_on_ref_genome.Rd:77-78: Lost braces
77 | \item \code{log_files}{
| ^
checkRd: (-1) compute_nsize.Rd:32: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_nsize.Rd:33-36: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_nsize.Rd:37-38: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_nsize.Rd:39-40: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_nsize.Rd:41-42: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_nsize.Rd:43-44: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size.Rd:21-28: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size.Rd:30-34: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size.Rd:36-40: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size.Rd:42-46: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size.Rd:48-52: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_n.Rd:16-23: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_n.Rd:25-29: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_n.Rd:31-35: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_n.Rd:37-41: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_n.Rd:43-47: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_neff.Rd:21-28: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_neff.Rd:30-34: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_neff.Rd:36-40: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_neff.Rd:42-46: Lost braces in \itemize; meant \describe ?
checkRd: (-1) compute_sample_size_neff.Rd:48-52: Lost braces in \itemize; meant \describe ?
checkRd: (-1) read_sumstats.Rd:29: Lost braces in \itemize; meant \describe ?
checkRd: (-1) read_sumstats.Rd:30: Lost braces in \itemize; meant \describe ?
checkRd: (-1) read_sumstats.Rd:31-32: Lost braces in \itemize; meant \describe ?
checkRd: (-1) read_vcf.Rd:64: Lost braces in \itemize; meant \describe ?
checkRd: (-1) read_vcf.Rd:65: Lost braces in \itemize; meant \describe ?
checkRd: (-1) read_vcf.Rd:66-67: Lost braces in \itemize; meant \describe ?
checkRd: (-1) read_vcf_parallel.Rd:40: Lost braces in \itemize; meant \describe ?
checkRd: (-1) read_vcf_parallel.Rd:41: Lost braces in \itemize; meant \describe ?
checkRd: (-1) read_vcf_parallel.Rd:42-43: Lost braces in \itemize; meant \describe ?
checkRd: (-1) select_vcf_fields.Rd:27: Lost braces in \itemize; meant \describe ?
checkRd: (-1) select_vcf_fields.Rd:28: Lost braces in \itemize; meant \describe ?
checkRd: (-1) select_vcf_fields.Rd:29-30: Lost braces in \itemize; meant \describe ?
checkRd: (-1) sort_coords.Rd:19-21: Lost braces in \itemize; meant \describe ?
checkRd: (-1) sort_coords.Rd:22-24: Lost braces in \itemize; meant \describe ?
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... SKIPPED
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking contents of ‘data’ directory ... OK
* checking data for non-ASCII characters ... OK
* checking data for ASCII and uncompressed saves ... OK
* checking R/sysdata.rda ... OK
* checking files in ‘vignettes’ ... SKIPPED
* checking examples ... SKIPPED
* checking for unstated dependencies in ‘longtests’ ... OK
* checking tests in ‘longtests’ ...
Running ‘testthat.R’
OK
* DONE
Status: 2 NOTEs
See
‘/home/biocbuild/bbs-3.22-bioc-longtests/meat/MungeSumstats.Rcheck/00check.log’
for details.
MungeSumstats.Rcheck/00install.out
* installing *source* package ‘MungeSumstats’ ...
** this is package ‘MungeSumstats’ version ‘1.17.5’
** using staged installation
** R
** data
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (MungeSumstats)