Skip to content

sarkaftghareeb/Sarkaft-DeepChem

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

Sarkaft-DeepChem as exmaple

Developing a Model(regression model) to train model using a library of 2,431 siRNA molecules, each 21 bases long. Every one of them has been tested experimentally and labeled with a value between 0 and 1, indicating how effective it is at silencing its target gene. Small values indicate ineffective molecules, while larger values indicate more effective ones. The model Takes the sequence as input and tries to predict the effectiveness. The predictive model for siRNA effectiveness offers a powerful tool for advancing scientific research and therapeutic development. By providing accurate predictions of siRNA activity, it supports the efficient and cost-effective design of gene-silencing molecules, ultimately contributing to advancements in gene therapy, functional genomics, and personalized medicine. The results from this model can be used to prioritize experimental efforts, deepen our understanding of RNA interference mechanisms, and accelerate the development of new treatments for genetic diseases. import tensorflow as tf import deepchem as dc from deepchem.models import layers

Initialize the model

model = dc.models.TensorGraph()

Define the input features and labels

features = layers.Feature(shape=(None, 21, 4)) labels = layers.Label(shape=(None, 1))

Add convolutional layers with dropout

prev = features for i in range(2): prev = layers.Conv1D( filters=10, kernel_size=10, activation=tf.nn.relu, padding='same', in_layers=prev ) prev = layers.Dropout(dropout_prob=0.3, in_layers=prev)

Flatten the output from convolutional layers

flattened = layers.Flatten(prev)

Add a dense layer with sigmoid activation

output = layers.Dense( out_channels=1, activation_fn=tf.sigmoid, in_layers=flattened )

Add the output layer to the model

model.add_output(output)

Define the loss function

loss = layers.ReduceMean(layers.L2Loss(in_layers=[labels, output])) model.set_loss(loss) import deepchem as dc

Load the training and validation datasets

train_dataset = dc.data.DiskDataset('train_siRNA') valid_dataset = dc.data.DiskDataset('valid_siRNA')

Define the evaluation metric

metric = dc.metrics.Metric(dc.metrics.pearsonr, mode='regression')

Train the model for 20 epochs

for epoch in range(20): print(f'Epoch {epoch + 1}/20') model.fit(train_dataset, nb_epoch=1)

# Evaluate the model on the training and validation datasets
train_scores = model.evaluate(train_dataset, [metric])
valid_scores = model.evaluate(valid_dataset, [metric])

print(f'Training Pearson Correlation: {train_scores[metric.name]:.4f}')
print(f'Validation Pearson Correlation: {valid_scores[metric.name]:.4f}')

C:\Users\sarkaft\Downloads\pearson_correlation_plot.png

##How the Model Can Be Used: siRNA Design and Optimization: Sequence Screening: Researchers can input potential siRNA sequences into the model to predict their effectiveness, allowing them to focus on the most promising candidates. Optimization: The model can help in optimizing siRNA sequences by identifying which modifications could enhance their gene-silencing activity.

Biological Research: Mechanistic Studies: The model can aid in studying the mechanisms of RNA interference by highlighting sequence features associated with high effectiveness. Functional Genomics: It can be used in functional genomics studies to identify key regulatory genes and pathways influenced by effective siRNA molecules. Drug Development:

Target Validation: Pharmaceutical companies can use the model to validate potential drug targets by designing siRNAs that effectively silence genes of interest. Therapeutic Development: The model can assist in developing siRNA-based therapeutics, including identifying sequences with high on-target and low off-target effects. Integration with Experimental Workflows:

Pre-Screening Tool: Incorporate the model as a pre-screening tool in experimental workflows to prioritize siRNA candidates for further validation in cell culture or animal models. Hybrid Approaches: Combine computational predictions with experimental data to iteratively improve the accuracy and reliability of siRNA designs.

About

Developing a Model to Forecast Molecular Toxicity

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published