Saturday, November 24, 2012
Finals Week Looms
Hope you all had a fantastic Thanksgiving holiday! Black Friday is over... this can only mean one thing to all college students: finals week is almost here! Let's all start preparing NOW! Don't let finals creep up on you and leave you cramming at the last minute; it never works in your favor. Even 30 minutes a day, beginning today, will help. Procrastination is the enemy, so let's rock these last few weeks! Happy studying.
Monday, November 19, 2012
Project Reference: NLP and Indexing
Proc AMIA Symp. 2001: 17–21.
PMCID: PMC2243666
Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program.
Abstract
The UMLS Metathesaurus, the largest thesaurus in the biomedical domain, provides a representation of biomedical knowledge consisting of concepts classified by semantic type and both hierarchical and non-hierarchical relationships among the concepts. This knowledge has proved useful for many applications including decision support systems, management of patient records, information retrieval (IR) and data mining. Gaining effective access to the knowledge is critical to the success of these applications. This paper describes MetaMap, a program developed at the National Library of Medicine (NLM) to map biomedical text to the Metathesaurus or, equivalently, to discover Metathesaurus concepts referred to in text. MetaMap uses a knowledge-intensive approach based on symbolic, natural language processing (NLP) and computational linguistic techniques. Besides being applied for both IR and data mining applications, MetaMap is one of the foundations of NLM's Indexing Initiative System, which is being applied to both semi-automatic and fully automatic indexing of the biomedical literature at the library.
Full text
Full text is available as a scanned copy of the original print version (PDF, 806K).
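MetaMap's core task, mapping phrases in text onto Metathesaurus concepts, can be illustrated with a toy dictionary lookup. This is only a minimal sketch: the thesaurus entries and CUIs below are made up, and real MetaMap adds linguistic variant generation and a composite candidate score (centrality, variation, coverage, cohesiveness), none of which is modeled here.

```python
# Toy sketch of dictionary-based concept mapping in the spirit of MetaMap:
# find thesaurus entries whose words all occur in the input phrase, and
# rank them by how much of the phrase they cover. The entries and CUIs
# are hypothetical, for illustration only.

MINI_THESAURUS = {
    "ocular complication": "C0000001",   # made-up CUIs
    "complication": "C0000002",
    "myasthenia gravis": "C0000003",
}

def candidate_concepts(phrase, thesaurus=MINI_THESAURUS):
    """Return (entry, cui, coverage) tuples for entries fully contained in the phrase."""
    words = set(phrase.lower().split())
    matches = []
    for entry, cui in thesaurus.items():
        entry_words = set(entry.split())
        if entry_words <= words:                      # every entry word appears in the phrase
            coverage = len(entry_words) / len(words)  # crude stand-in for MetaMap's coverage score
            matches.append((entry, cui, coverage))
    # prefer candidates that cover more of the input phrase
    return sorted(matches, key=lambda m: m[2], reverse=True)

print(candidate_concepts("ocular complication"))
# the full-phrase match ranks above the single-word match
```

Even this crude version shows why a best-match score matters: "complication" alone is a valid concept, but the longer "ocular complication" match covers the whole phrase and should win.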
Sunday, November 18, 2012
1000 Genomes Project-Exonic Sequencing Datasets
Copy Number Variation detection from 1000 Genomes project exon capture sequencing data
Jiantao Wu, Krzysztof R Grzeda, Chip Stewart, Fabian Grubert, Alexander E Urban, Michael P Snyder and Gabor T Marth
BMC Bioinformatics 2012, 13:305 doi:10.1186/1471-2105-13-305
Published: 17 November 2012
Abstract (provisional)
Background
DNA capture technologies combined with high-throughput sequencing now enable cost-effective, deep-coverage, targeted sequencing of complete exomes. This is well suited for SNP discovery and genotyping. However, little attention has been devoted to Copy Number Variation (CNV) detection from exome capture datasets, despite the potentially high impact of CNVs in exonic regions on protein function.
Results
As members of the 1000 Genomes Project analysis effort, we investigated 697 samples in which 931 genes were targeted and sampled with 454 or Illumina paired-end sequencing. We developed a rigorous Bayesian method to detect CNVs in the genes, based on read depth within target regions. Despite substantial variability in read coverage across samples and targeted exons, we were able to identify 107 heterozygous deletions in the dataset. The experimentally determined false discovery rate (FDR) of the cleanest dataset from the Wellcome Trust Sanger Institute is 12.5%. We were able to substantially improve the FDR in a subset of gene deletion candidates that were adjacent to another gene deletion call (17 calls). The estimated sensitivity of our call-set was 45%.
Conclusions
This study demonstrates that exonic sequencing datasets, collected in both population-based and medical sequencing projects, will be a useful substrate for detecting genic CNV events, particularly deletions. Based on the number of events we found and the sensitivity of the methods in the present dataset, we estimate on average 16 genic heterozygous deletions per individual genome. Our power analysis informs ongoing and future projects about the sequencing depth and uniformity of read coverage required for efficient detection.
The complete article is available as a provisional PDF. The fully formatted PDF and HTML versions are in production.
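The paper's actual caller is Bayesian, but the underlying read-depth signal is easy to sketch: within a targeted region, a sample carrying a heterozygous deletion should show roughly half the cohort's typical coverage. Everything below (sample names, depths, thresholds) is illustrative and not taken from the study.

```python
import statistics

def call_het_deletions(depths, low=0.35, high=0.65):
    """Flag samples whose read depth in a target region looks like one lost
    copy (~0.5x the cohort median). `depths` maps sample -> read depth.
    Thresholds are illustrative; the paper uses a Bayesian model instead."""
    median = statistics.median(depths.values())
    calls = []
    for sample, depth in depths.items():
        ratio = depth / median
        if low <= ratio <= high:   # ~0.5x expected for a heterozygous deletion
            calls.append((sample, round(ratio, 2)))
    return calls

# Hypothetical per-sample depths for one targeted exon.
coverage = {"NA1": 98, "NA2": 104, "NA3": 51, "NA4": 100, "NA5": 97}
print(call_het_deletions(coverage))   # NA3 sits near 0.5x the cohort median
```

The abstract's point about "substantial variability in read coverage across samples and targeted exons" is exactly why a fixed-threshold version like this would be noisy in practice, and why the authors model the depth distribution instead.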
Saturday, November 17, 2012
Research: Hurry up and wait!
This has been one of those slow weeks; research often moves at a snail's pace. While waiting for responses from clinical labs this week, I had time to reflect on the life cycle of research. It isn't always glamorous.
The fun begins once a research topic has been defined and a hypothesis formulated. Getting organized and defining fuzzy concepts so they can be measured, qualitatively and quantitatively, sets the stage for the journey ahead. For me, creating a statistical data set ramps up the excitement of a new project. Things really heat up once an experiment is launched. And then... the waiting begins. Results can't be forced, so time takes control of the process. I am impatient. I hate waiting. I sent second requests to 77 labs with a deadline of Nov 30th, updated the database, and prepared a rough outline of my paper... the boring stuff. I suppose this is the calm before the storm. I hope to have more next week...
Thursday, November 15, 2012
This is it!!!! My Dream Grad-Level Research Project
Improving accuracy for cancer classification with a new algorithm for genes selection
Hongyan Zhang, Haiyan Wang, Zhijun Dai, Ming-shun Chen and Zheming Yuan
BMC Bioinformatics 2012, 13:298 doi:10.1186/1471-2105-13-298
Published: 13 November 2012
Abstract (provisional)
Background
Even though the classification of cancer tissue samples based on gene expression data has advanced considerably in recent years, it faces great challenges to improve accuracy. One of the challenges is to establish an effective method that can select a parsimonious set of relevant genes. So far, most methods for gene selection in the literature focus on screening individual genes or pairs of genes without considering the possible interactions among genes. Here we introduce a new computational method named the Binary Matrix Shuffling Filter (BMSF). It not only overcomes the difficulty associated with the search schemes of traditional wrapper methods and the overfitting problem in a large-dimensional search space, but also takes potential gene interactions into account during gene selection. This method, coupled with a Support Vector Machine (SVM) for implementation, often selects a very small number of genes, making models easy to interpret.
Results
We applied our method to 9 two-class gene expression datasets involving human cancers. During the gene selection process, the set of genes to be kept in the model was recursively refined and repeatedly updated according to the effect of a given gene on the contributions of other genes, in reference to their usefulness in cancer classification. The small number of informative genes selected from each dataset leads to significantly improved leave-one-out cross-validation (LOOCV) classification accuracy across all 9 datasets for multiple classifiers. Our method also exhibits broad generalization in the genes selected, since multiple commonly used classifiers achieved either equivalent or much higher LOOCV accuracy than those reported in the literature.
Conclusions
A gene's contribution to binary cancer classification is better evaluated after adjusting for the joint effect of a large number of other genes. A computationally efficient search scheme is provided to perform an effective search in the extensive feature space that includes possible interactions of many genes. The performance of the algorithm applied to 9 datasets suggests that it is possible to improve the accuracy of cancer classification by a large margin when the joint effects of many genes are considered.
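BMSF itself isn't reproduced here, but the LOOCV accuracy figure the abstract keeps citing is simple to sketch: hold out one sample, train on the rest, predict the held-out sample, and repeat for every sample. The sketch below uses a nearest-centroid classifier as a stand-in for the paper's SVM, on a tiny made-up two-gene dataset; all names and numbers are hypothetical.

```python
def loocv_accuracy(samples, labels, classify):
    """Leave-one-out cross-validation: train on all samples but one,
    predict the held-out sample, and report the fraction predicted correctly."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

def nearest_centroid(train_x, train_y, query):
    """Stand-in classifier (the paper uses an SVM): assign the class whose
    mean expression profile is closest to the query sample."""
    centroids = {}
    for cls in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(query, centroids[c])))

# Tiny made-up "expression" dataset: two genes, two well-separated classes.
X = [[1.0, 0.2], [0.9, 0.1], [1.1, 0.3], [0.1, 1.0], [0.2, 0.9], [0.0, 1.1]]
y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
print(loocv_accuracy(X, y, nearest_centroid))  # prints 1.0 on this separable toy data
```

With only a handful of selected genes, LOOCV like this is cheap, which is part of why a parsimonious gene set makes the reported multi-classifier comparison practical.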
Alabama State University- Center for NanoBiotechnology Research
1) Structural studies of RSV at nanoscale and viral inhibition by nanoparticles,
2) Carbon nanotube Attached with ssDNA as nanosensor for detection of Salmonella Typhimurium,
3) The Development of Nanobiomaterials for Drug Delivery, and
4) Delivery of Nanoparticle-Encapsulated AntiChlamydial Peptides in an Animal Model.
These projects are carried out in collaboration with the University of Louisville, University of South Florida, University of Alabama at Birmingham and the Tulane National Primate Research Center. The center includes an international scientist exchange program in collaboration with China, India, Argentina, Singapore and Japan. The CNBR aims to:
- Communicate with the community
- Perform world-class research in nanobiotechnology
- Provide educational opportunities to minority students
- Develop new curricula and programs in nanobiotechnology
- Enhance industry-institutional research and commercialize research products
Friday, November 9, 2012
Hi All!
My name is Deeda Webster, and I am interested in the field of bioinformatics. Currently, I am enjoying an unpaid internship at Arizona Strategic Enterprise Technology. The research project I've been assigned involves a survey of all CLIA-accredited clinical laboratories in the State of Arizona; there are about 587. I am currently seeking to identify the capability of these laboratories to electronically send and receive structured laboratory results. The use of standardized nomenclature by laboratories, such as the Logical Observation Identifiers Names and Codes (LOINC), is another major focus of my research. At the conclusion of the project, I will analyze the data and create statistical summary and comparison reports to identify laboratories' barriers to interoperability and standardization. It has been my good fortune to meet with Arizona DHS epidemiologists to discuss the future of syndromic surveillance. Fascinating!