Blood-Based Gene Expression in Children with Autism Spectrum Disorder

1 Department of Pediatrics and Medical Genetics, Medical University, Plovdiv, Vasil Aprilov 15-A Str., Plovdiv 4002, Bulgaria
2 University Hospital “St. George” Plovdiv, 66 Peshtersko Shuse Str., Plovdiv 4000, Bulgaria
3 Psychiatric ward for active treatment of men, State Psychiatry Hospital Pazardzhik, 28 Bolnichna Str., 4400 Pazardzhik, Bulgaria
4 Department of Plant Physiology and Molecular Biology, University of Plovdiv “Paisii Hilendarski”, 24 Tzar Assen Str., Plovdiv, Bulgaria


INTRODUCTION
Autism spectrum disorder (ASD) is a diagnostic entity that reflects a scientific consensus that several previously separate disorders are, under the current definition, a single spectrum disorder with different levels of symptom severity in two core domains: deficits in social communication and interaction, and restricted, repetitive behaviors [1]. According to the Centers for Disease Control and Prevention (CDC) and the Autism and Developmental Disabilities Monitoring (ADDM) Network in the U.S., 1 in 88 children is identified with ASD [2], although any increase in reported prevalence might merely reflect improved diagnostic means. The etiology of ASD is mainly attributable to genetic factors. The estimated heritability of ASD is more than 90%, and its genetic basis is heterogeneous and complex, involving multiple genes, gene-gene interactions, and gene-environment interactions [3]. Identifying the genetic basis can shed light on the etiology and pathogenesis of this disorder, which still remain elusive. The genetic risk factors for ASD identified so far range from common variants conferring a small clinical effect to rare mutations that confer a high clinical effect [4]. Several genes have been found to be associated with ASD, but more genes remain to be discovered.
Microarray-based gene expression profiling is a technology that allows simultaneous measurement of hundreds to thousands of gene transcripts, which makes it useful for large-scale gene discovery studies [5]. This technology has been used in several post-mortem brain studies of psychiatric disorders, including schizophrenia, bipolar disorder, and autism [6]. Since there are several limitations to the use of post-mortem brain tissue in gene expression studies, and the study of fresh brain tissue from living psychiatric patients is impractical at present, several studies have reported the use of peripheral blood cells and lymphoblastoid cell lines (LCL) as surrogates for brain tissue [7]. In addition, a moderate correlation of gene expression between peripheral blood cells and brain tissue in humans has been reported, supporting the usefulness of peripheral blood cells in gene expression studies for psychiatric research [8].
These findings attracted our interest in identifying novel differentially expressed genes associated with ASD in our population, using comparative gene expression profiling based on next-generation sequencing analysis of peripheral blood samples.

RESULTS
Based on the expression analysis, we identified 23 differentially expressed genes with statistically significant changes (p < 0.01, FDR < 0.001 and fold change > 1) (Table 1).

KEGG pathway enrichment analysis of DEGs
Genes usually interact with each other to play roles in certain biological functions. Pathway-based analysis helps to further understand the biological functions of genes. We used the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, which is the major public pathway-related database [9], to extract the relations between ASD and the affected biological pathways. Pathway enrichment analysis identifies metabolic pathways and signal transduction pathways that are significantly enriched in DEGs compared with the whole-genome background. We carried out a pathway-based analysis to identify the risk pathways of ASD. The 21 most significant KEGG signaling pathways for both up- and down-regulated gene sets are listed in Table 2 in order of significance (size of P values).
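The comparison of DEGs against the whole-genome background can be illustrated with a short sketch. It assumes a one-sided hypergeometric test, which is a common choice for pathway enrichment but is not stated in the text; the function name and all counts in the example are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_degs, n_overlap):
    """Probability of seeing at least n_overlap pathway genes among n_degs
    genes drawn without replacement from n_background annotated genes,
    of which n_pathway belong to the pathway (assumed test, see lead-in)."""
    upper = min(n_pathway, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 annotated genes, a 180-gene pathway,
    # 23 DEGs, 3 of which fall in the pathway.
    print(hypergeom_enrichment_p(20000, 180, 23, 3))
```

A small P value from such a test indicates that the pathway contains more DEGs than expected by chance; with 21 or more pathways tested, the values would normally also be corrected for multiple testing.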

DISCUSSION
The data show differences in gene expression in whole blood of children with autism compared to typically developing children from the general population matched for age and gender. Whole-transcriptome sequencing of pools of all children with autism compared to controls gives comprehensive information on a large number of significantly differentially expressed genes, suggesting that some aspect of "autism" is shared by most of the children sampled. Pathway analyses based on nominally significant genes revealed 21 KEGG pathways that were significantly changed in ASD children compared to controls (nominal P-value < 0.05) and are involved in signaling, immune system and metabolic processes. Our findings suggest that synaptic plasticity, disturbances in axon guidance, neuronal survival, differentiation and inflammation play a major role in the pathophysiology of ASD. We briefly discuss some of the most significant pathways below.

Calcium signaling pathway
Dysregulation of the calcium signaling pathway is of interest because a recent collaborative meta-analysis of five major neuropsychiatric disorders, including ASD, suggests that two calcium channel coding genes, CACNA1C and CACNB2, are significantly associated with all five diseases [10]. Calcium signals culminate in the fusion of synaptic vesicles with the neuronal presynaptic membrane, and the sum of the evidence is that the pre- and postsynaptic membrane changes recognized in several fundamental processes of synaptic plasticity and learning are calcium sensitive and are altered in models of ASD. Recent findings on the etiopathogenesis of the ASD-causing Angelman, Prader-Willi and Rett syndromes and tuberous sclerosis suggest an inability of neurons to generate adaptive responses via calcium-regulated gene expression [11].

MAPK signaling pathway
Dysregulation of Ras/mitogen-activated protein kinase (Ras/MAPK) pathway genes leads to a class of disorders known as RASopathies, which includes neurofibromatosis type 1 (NF1), Costello syndrome (CS), Noonan syndrome (NS), and cardio-facio-cutaneous syndrome (CFC). A potential genetic and phenotypic overlap between dysregulation of Ras/MAPK signalling and ASD was suggested in previous studies. The higher prevalence and severity of ASD traits in individuals with RASopathies compared to their unaffected siblings suggest that dysregulation of Ras/MAPK signalling during development may be implicated in ASD risk [12].

Wnt signaling pathway
The Wnt pathway is involved in various well-known cellular processes, including differentiation and migration, especially during nervous system development, and cell proliferation. Given its various functions, dysfunction of the canonical Wnt pathway probably exerts adverse effects on neurodevelopment and may thereby contribute to the pathogenesis of autism.
In recent years, an increasing amount of evidence has shown that components of Wnt pathways are involved in major psychiatric disorders. A literature review supports the contention that modification of genes affecting the activity of the Wnt pathway could contribute to individual forms of ASD [13].

Glutamatergic and GABAergic synapse
A balance between the excitatory neurotransmitter glutamate and the inhibitory neurotransmitter GABA (gamma-aminobutyric acid) is crucial for proper development and functioning of the brain. GABAergic and glutamatergic interneurons maintain excitability, integrity and synaptic plasticity. Several recent lines of evidence have implicated a relative loss of inhibitory GABA, with corresponding glutamate-mediated hyper-excitation, in the development of ASD. Moreover, several studies have demonstrated an imbalance of excitatory/inhibitory neurotransmitters resulting from neurodevelopmental impairments in the glutamatergic and GABAergic systems, which might represent a common pathological mechanism underlying developmental disorders [14].
In summary, our list of differentially regulated genes is enriched with pathways associated with nervous system development and function and with the immune system, and most of them appear to cluster around core networks such as kinase and/or signaling networks. Therefore, our results support the involvement of various genetic factors (heterogeneity) in the development of ASD, while suggesting that these different factors may converge at, or diverge from, central networks such as signaling networks.
Findings from our current study indicate that there are clear and significant abnormalities in gene expression in peripheral blood samples obtained from children with ASD compared to healthy controls. This promising work, while far from definitive, lends further support to the recently emerging principle that peripheral blood is a potentially useful source of diagnostic biomarkers for disorders of the brain and other inaccessible tissues [15]. If the results of this work are confirmed in future studies, and the changes identified in the study group are individually validated by us or by others in independent cohorts, the differentially expressed genes may help clarify the etiology and pathogenesis of this disorder, and the dysregulated pathways may also provide targets for experimental treatments in ASD.

Subjects
A total of 30 subjects (24 male, 6 female) with idiopathic ASD aged 3 to 11 years (mean age of the sample 6.86 years) and 30 healthy children age- and sex-matched to the patient group were recruited for gene expression analysis. Participants were interviewed by experienced child psychiatrists. The diagnosis of ASD was established using the ADI-R [16] and the CARS or GARS [17]. All of the participants were Bulgarians. None of the participants had received any medication before blood sampling.
Written informed consent and child assent were obtained from the parents and the probands, respectively, after the purposes and procedures of the study were fully explained and confidentiality was ensured.

Blood collection and RNA isolation
An aliquot of peripheral whole blood (2.5 ml) from each subject (ASD-diagnosed and healthy controls) was collected into PAXgene blood RNA tubes (PreAnalytiX), stored for a minimum of 4 hours and then frozen at −80°C, as this method gives the highest RNA yield [18]. After collection of all samples, total RNA was isolated using the PAXgene blood RNA kit (PreAnalytiX) according to the manufacturer's protocol. The resulting total RNA samples were treated with RNase-free DNase I (Promega) according to the manufacturer's protocol, followed by a PCR check for DNA contamination.

Quantitative analysis of the isolated total RNA samples
The measured absorbance ratio at 260 nm and 280 nm (A260/A280) was between 1.93 and 2.1 for all samples included for further analysis, in accordance with the service requirements. Pooled samples were created by adding an equivalent amount of total RNA from each individual sample, to a final amount of 5 µg of RNA per pool. Pooled RNA samples were precipitated according to the service requirements: each pooled RNA sample was mixed with 1/10 volume of 3 M NaOAc, pH 5.2, and 3 volumes of 100% ethanol, to a final volume of 400 µl. Aliquots of pooled RNAs were frozen at −80°C and shipped on dry ice. RNA integrity of the pooled samples (ASD and control group) was assessed by agarose gel electrophoresis (Fig. 1) and checked with an Agilent 2100 Bioanalyzer (Fig. 2 and 3).
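The pooling and precipitation arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the laboratory protocol: it assumes that the 5 µg refers to the total RNA mass of one pool, that every sample contributes the same mass, and that both the 1/10 volume of NaOAc and the 3 volumes of ethanol are expressed relative to the volume of the RNA solution. The function names and the example concentrations are hypothetical.

```python
def pooling_volumes(concentrations_ng_per_ul, pool_total_ug=5.0):
    """Volume (in µl) to take from each sample so that every sample
    contributes the same RNA mass and the pool sums to pool_total_ug."""
    per_sample_ng = pool_total_ug * 1000.0 / len(concentrations_ng_per_ul)
    return [per_sample_ng / c for c in concentrations_ng_per_ul]


def precipitation_mix(final_volume_ul=400.0, salt_fraction=0.1, ethanol_volumes=3.0):
    """Split a final volume into RNA solution, 3 M NaOAc and 100% ethanol.
    ASSUMPTION: both additions are relative to the RNA solution volume."""
    rna_ul = final_volume_ul / (1.0 + salt_fraction + ethanol_volumes)
    return {
        "rna_ul": rna_ul,
        "naoac_ul": rna_ul * salt_fraction,
        "ethanol_ul": rna_ul * ethanol_volumes,
    }


if __name__ == "__main__":
    # Hypothetical concentrations (ng/µl) for three samples of a pool.
    print(pooling_volumes([120.0, 95.0, 150.0]))
    print(precipitation_mix())
```

Under these assumptions, a 400 µl final volume corresponds to roughly 97.6 µl of RNA solution, 9.8 µl of NaOAc and 292.7 µl of ethanol.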

RNA-Seq (quantification) analysis
In this study we used the Beijing Genomics Institute (BGI), a Certified Service Provider, for the sequencing service. The total RNA samples were first treated with DNase I to degrade any possible DNA contamination. The mRNA was then enriched using oligo(dT) magnetic beads. After mixing with fragmentation buffer, the mRNA was fragmented into short fragments (about 200 bp). The first strand of cDNA was then synthesized using random hexamer primers. Buffer, dNTPs, RNase H and DNA polymerase I were added to synthesize the second strand. The double-stranded cDNA was purified with magnetic beads. End repair and addition of a single adenine (A) nucleotide at the 3' end were then performed. Finally, sequencing adaptors were ligated to the fragments, and the fragments were enriched by PCR amplification. During the QC step, the Agilent 2100 Bioanalyzer (Fig. 1 and 2) and the ABI StepOnePlus Real-Time PCR System were used to qualify and quantify the sample library. The library products were sequenced on the Illumina HiSeq™ 2000.

Quantification of gene expression
The expression level of each gene was determined from the number of reads uniquely mapped to that gene and the total number of uniquely mapped reads in the sample. The gene expression level was calculated using RPKM (Reads Per Kilobase per Million mapped reads) [19]. RPKM is a method of quantifying gene expression from RNA sequencing data by normalizing for gene length and the total number of sequencing reads. The RPKM method thus eliminates the influence of differences in gene length and sequencing depth on the calculated gene expression level. Therefore, RPKM values can be used directly to compare gene expression among samples. If there is more than one transcript for a gene, the longest one is used to calculate its expression level and coverage.
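As a brief illustration of the normalization described above, RPKM is conventionally computed as 10^9 · C / (N · L), where C is the number of reads uniquely mapped to the gene, N is the total number of uniquely mapped reads in the sample, and L is the gene length in bases. The sketch below uses this standard definition; the function name and the example numbers are hypothetical.

```python
def rpkm(gene_reads, total_mapped_reads, gene_length_bp):
    """Reads Per Kilobase of transcript per Million mapped reads.

    gene_reads:         reads uniquely mapped to the gene (C)
    total_mapped_reads: all uniquely mapped reads in the sample (N)
    gene_length_bp:     length of the (longest) transcript in bases (L)
    """
    if total_mapped_reads <= 0 or gene_length_bp <= 0:
        raise ValueError("total reads and gene length must be positive")
    return 1e9 * gene_reads / (total_mapped_reads * gene_length_bp)


if __name__ == "__main__":
    # Hypothetical gene: 500 reads, 2,000 bp long, 20 million mapped reads.
    print(rpkm(500, 20_000_000, 2000))  # 12.5
```

Doubling either the gene length or the sequencing depth while keeping the read count fixed halves the RPKM value, which is what makes the values comparable across genes and samples.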

Screening of differentially expressed genes (DEGs)
This analysis includes the screening of genes that are differentially expressed among samples and the KEGG pathway enrichment analysis of these DEGs. Following "The significance of digital gene expression profiles" [20], a strict algorithm was applied to identify differentially expressed genes between two samples.
Denote the number of unambiguous clean tags from gene A as x. Given that every gene's expression occupies only a small part of the library, p(x) will closely follow the Poisson distribution:

$$p(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}$$

where λ is the real number of transcripts of the gene. The total clean tag number of sample 1 is N1, and the total clean tag number of sample 2 is N2; gene A holds x tags in sample 1 and y tags in sample 2. The probability of gene A being expressed equally between the two samples can be calculated with [20]:

$$p(y \mid x) = \left(\frac{N_2}{N_1}\right)^{y} \frac{(x+y)!}{x!\,y!\,\left(1+\frac{N_2}{N_1}\right)^{x+y+1}}$$

The P-value corresponds to the differential gene expression test. FDR (False Discovery Rate) is a method to determine the threshold of the P-value in multiple tests. Assume that we have picked out R differentially expressed genes, of which S genes really show differential expression and the other V genes are false positives. If we decide that the error ratio Q = V / R must stay below a cutoff (e.g. 1%), we should preset the FDR to a number no larger than 0.01 [21]. "FDR ≤ 0.001 and the absolute value of log2Ratio ≥ 1" was used as the threshold to judge the significance of gene expression differences.

Figure 1. Agarose gel analysis of the pooled RNA samples from the ASD group (ASD) and healthy controls (HC).

Figure 2. Agilent 2100 Bioanalyzer results of the pooled RNA sample from the ASD group (ASD). The two tall peaks at 2000 and 4000 are the ribosomal RNA. The RNA integrity number (RIN) of the pooled total RNA was calculated from the Bioanalyzer traces; the pooled total RNA has a RIN of 7.7.

Figure 3. Agilent 2100 Bioanalyzer results of the pooled RNA sample from the healthy control group (HT). The two tall peaks at 2000 and 4000 are the ribosomal RNA. The RNA integrity number (RIN) of the pooled total RNA was calculated from the Bioanalyzer traces; the pooled total RNA has a RIN of 7.9.
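The screening procedure described under "Screening of differentially expressed genes (DEGs)" can be sketched in a few functions. This is a minimal illustration, not the service provider's implementation. The conditional probability follows the formula of reference [20]; the two-sided P-value convention (twice the smaller tail) and the Benjamini-Hochberg adjustment are assumptions chosen for the sketch, since the text only cites [21] for the FDR step. All function names are hypothetical.

```python
from math import exp, lgamma, log, log2


def ac_prob(y, x, n1, n2):
    """p(y | x): probability of y tags in sample 2 given x tags in sample 1,
    with library sizes n1 and n2, under equal expression [20]."""
    r = n2 / n1
    log_p = (y * log(r) + lgamma(x + y + 1) - lgamma(x + 1)
             - lgamma(y + 1) - (x + y + 1) * log(1.0 + r))
    return exp(log_p)


def ac_pvalue(x, y, n1, n2):
    """Two-sided P-value, ASSUMED here to be twice the smaller tail.
    The upper tail is obtained by subtraction, so values below about
    1e-15 are not resolved and may be reported as 0."""
    lower = sum(ac_prob(i, x, n1, n2) for i in range(y + 1))    # P(Y <= y)
    upper = max(0.0, 1.0 - lower + ac_prob(y, x, n1, n2))       # P(Y >= y)
    return min(1.0, 2.0 * min(lower, upper))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values (stand-in for the FDR step)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest to smallest P
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_deg(fdr, rpkm_control, rpkm_case, fdr_cutoff=0.001, min_abs_log2=1.0):
    """Apply 'FDR <= 0.001 and |log2Ratio| >= 1'; both RPKM values must be > 0."""
    log2_ratio = log2(rpkm_case / rpkm_control)
    return fdr <= fdr_cutoff and abs(log2_ratio) >= min_abs_log2


if __name__ == "__main__":
    # Hypothetical tag counts for one gene in two libraries of 10 million tags.
    print(ac_pvalue(x=40, y=120, n1=10_000_000, n2=10_000_000))
```

In practice the P-values of all tested genes would be passed through the FDR adjustment together, and `is_deg` would then be applied gene by gene to the adjusted values and the RPKM ratios.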

Table 1. Differentially expressed genes with statistically significant changes.

Table 2. Pathway enrichment analysis of DEGs.