% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/KNNPeptide.R
\name{KNNPeptide}
\alias{KNNPeptide}
\title{K-Nearest Neighbor for Peptides (KNNPeptide)}
\usage{
KNNPeptide(seqs, trainSeq, percent = 30, label = c(), labeltr = c())
}
\arguments{
\item{seqs}{A FASTA file of amino acid sequences, in which each sequence record starts
with a '>' character, or a character vector in which each element is a peptide or protein sequence.}

\item{trainSeq}{A FASTA file of amino acid sequences, in which each sequence record starts
with a '>' character, or a character vector in which each element is a peptide sequence. Each sequence in the training set
is associated with a label, which is given in the parameter labeltr.}

\item{percent}{The threshold, as a percentage of the training set, that bounds how many of the most similar training sequences are considered for each input sequence. The default is 30.}

\item{label}{An optional vector whose length equals the number of input sequences. It gives the class of
each entry (i.e., sequence).}

\item{labeltr}{A vector whose length equals the number of sequences in the training set. It gives the class of
each sequence in the training set.}
}
\value{
This function returns a feature matrix with one row per input sequence and percent*(number of classes) columns.
}
\description{
This function requires a training data set (trainSeq) and the class label of each training sequence (labeltr).
For each input sequence, it computes a similarity score against every sequence in the training data set, using the BLOSUM62 matrix.
For each k from 1 to percent, KNNPeptide takes the k\% of training sequences that are most similar to the input sequence
and reports the frequency of each class among them. The resulting feature vector has length percent*(number of classes).
}
\note{
This function applies only when all amino acid sequences, in both the training data set and the set of input sequences, have the same length.
}
\examples{


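# Locate the example data shipped with the package
ptmSeqsADR<-system.file("extdata/",package="ftrCOOL")

# Input sequences (second column of the CSV file)
ptmSeqsVect<-as.vector(read.csv(paste0(ptmSeqsADR,"/ptmVect101AA.csv"))[,2])

# Positive and negative training sequences
posSeqs<-as.vector(read.csv(paste0(ptmSeqsADR,"/poSeqPTM101.csv"))[,2])
negSeqs<-as.vector(read.csv(paste0(ptmSeqsADR,"/negSeqPTM101.csv"))[,2])

# Keep the first 10 sequences of each class so the example runs quickly
posSeqs<-posSeqs[1:10]
negSeqs<-negSeqs[1:10]

trainSeq<-c(posSeqs,negSeqs)

# Training labels: 1 for positive sequences, 0 for negative sequences
labelPos<-rep(1,length(posSeqs))
labelNeg<-rep(0,length(negSeqs))

labeltr<-c(labelPos,labelNeg)

KNNPeptide(seqs=ptmSeqsVect,trainSeq=trainSeq,percent=10,labeltr=labeltr)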
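
# Illustrative sketch only, not the internal code of KNNPeptide. It mimics the
# idea in the description on toy data so the shape of the output is easy to see.
# Assumptions: similarity is a plain count of matching positions (a stand-in for
# the BLOSUM62 score), and "k percent of the training set" is rounded up to at
# least one neighbour. The real function may round, order or scale differently.
# knnSketch, toyTrain, toyLabels, toySeqs and toyFeat are hypothetical names.
toyTrain<-c("ACDEF","ACDEG","ACDKK","WYWYW","WYWYA","WYWKK")
toyLabels<-c(1,1,1,0,0,0)
toySeqs<-c("ACDEF","WYWYW")

knnSketch<-function(seqs,trainSeq,percent,labeltr){
  classes<-sort(unique(labeltr))
  t(sapply(seqs,function(s){
    sChars<-strsplit(s,"")[[1]]
    # Similarity of s to every training sequence (equal lengths assumed)
    sim<-sapply(trainSeq,function(tr) sum(sChars==strsplit(tr,"")[[1]]))
    # Training labels ordered from most to least similar
    ranked<-labeltr[order(sim,decreasing=TRUE)]
    # For k = 1..percent: class frequencies among the top k percent neighbours
    unlist(lapply(seq_len(percent),function(k){
      n<-max(1,ceiling(k/100*length(trainSeq)))
      top<-ranked[seq_len(n)]
      sapply(classes,function(cl) mean(top==cl))
    }))
  }))
}

toyFeat<-knnSketch(toySeqs,toyTrain,percent=50,labeltr=toyLabels)

# One row per input sequence, percent*(number of classes) columns
dim(toyFeat)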

}
\references{
Chen, Zhen, et al. "iFeature: a python package and web server for features extraction and selection from protein and peptide sequences." Bioinformatics 34.14 (2018): 2499-2502.
}
