My metadata.csv file has two columns, i.e. the first column is a string representing the accession_id of the data sample and the second column is a COVID variant lineage.

I would like to extract accession_ids along with their lineage for a few specific variants of interest, for example Omicron, i.e. B.1.1.529. I tried to grep the file with -w, but since . is a non-word character, it also fetches the results for variants that extend Omicron, for example B.1.1.529.1.

For an elaborated discussion, please have a look at this bash script I wrote:

    #!/bin/bash
    metadata_file="$HOME/thesis/SARS-CoV2-data/metadata.csv"
    # strip all double quotes and turn the commas into tabs
    cat "$metadata_file" | tr -d '"' | tr ',' $'\t' > adj_metadata.tsv

    if [ -f adj_metadata.tsv ]; then
        # filter data based on the selected lineages (refer to variants_lineage.txt for more info) as given below.
        # (the selected_lineages array is defined here; its contents are not shown)
        pattern=$(echo "${selected_lineages[@]}" | tr ' ' '|')
        : > filtered_metadata.tsv   # start from an empty output file
        for lineage in "${selected_lineages[@]}"; do
            echo "Filtering for lineage $lineage"
            # append, so that one lineage does not overwrite the previous one
            grep -w "$lineage" adj_metadata.tsv >> filtered_metadata.tsv
        done
    else
        echo "Adjusted metadata file does not exist."
    fi

    # Check for the uniqueness of the filtered_metadata.tsv file, this should fetch the list of selected_lineages
    cut -d$'\t' -f2 filtered_metadata.tsv | sort | uniq

Any suggestions/advice are very much appreciated. And please also feel free to comment on improvements that are not related to the question.

Method 1: Since the lineage in your .csv is always between double quotes ("), you could include the quotes in your match. You then simply use single quotes (') around the expression, so that the double quotes become part of the pattern. B.1.1.529.1 will not match in this case.

While method 1 would work with your input data, it would not work with adj_metadata.tsv, as that file is stripped of all quotes. You could of course modify your script to match first and then pipe the output through tr, but that would involve unnecessary work.

Method 2: What you could do there is anchor the regular expression to the end of the line with $. The only modification you need to make to your script with this method is to add \$ at the right spot in your grep command (the sed line additionally escapes the dots):

    for lineage in "${selected_lineages[@]}"; do
        echo "Filtering for lineage $lineage"
        escaped=$(echo "$lineage" | sed 's/\./\\./g')   # replace all occurrences of "." with "\."
        grep -w "$escaped\$" adj_metadata.tsv >> filtered_metadata.tsv
    done

Note that . in a regular expression normally matches any single character, so you need to escape it with a \ to search for a literal dot. You could still keep it without the \ for the sake of simplicity while typing; the pattern would then also match lines in which some other character stands in place of a dot.
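The difference between plain `grep -w`, a match that includes the CSV quotes, and an escaped pattern anchored to the end of the line can be tried out with a small sketch. It assumes a tiny, made-up metadata.csv in a temporary directory; the accession IDs EPI_1 to EPI_3 and the variable names are hypothetical and only serve the illustration.

```shell
# Hypothetical sample data: the accession IDs below are made up for illustration.
workdir=$(mktemp -d)
printf '%s\n' \
    '"EPI_1","B.1.1.529"' \
    '"EPI_2","B.1.1.529.1"' \
    '"EPI_3","B.1.1.7"' > "$workdir/metadata.csv"

# Same conversion as in the script: strip the quotes, turn commas into tabs.
tr -d '"' < "$workdir/metadata.csv" | tr ',' '\t' > "$workdir/adj_metadata.tsv"

lineage="B.1.1.529"

# 1) -w alone: "." is a non-word character, so B.1.1.529.1 is matched as well.
word_only=$(grep -w "$lineage" "$workdir/adj_metadata.tsv" | cut -f2)

# 2) Method 1: include the surrounding double quotes, using the original CSV.
quoted=$(grep "\"$lineage\"" "$workdir/metadata.csv" | cut -d',' -f2 | tr -d '"')

# 3) Method 2: escape the dots and anchor the pattern to the end of the line.
escaped=$(printf '%s\n' "$lineage" | sed 's/\./\\./g')
anchored=$(grep -w "$escaped\$" "$workdir/adj_metadata.tsv" | cut -f2)

printf 'grep -w only:\n%s\n' "$word_only"
printf 'quoted match:\n%s\n' "$quoted"
printf 'anchored match:\n%s\n' "$anchored"

rm -rf "$workdir"
```

The anchored variant relies on the lineage being the last column of each line; if further columns followed it, the end-of-line anchor would have to be replaced by a match on the following tab.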