SecretDataScientist.com

What is a Deep Belief Network?

Deep Belief Nets are probabilistic generative models that are composed of multiple layers of stochastic, latent variables. The latent variables typically have binary values and are often called hidden units or feature detectors. The top two layers have undirected, symmetric connections between them and form an associative memory. The lower layers receive top-down, directed connections from the layer above. The … Read more

What is a Decision Tree?

A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm. Decision trees are commonly used in operations research, specifically in decision analysis, to help identify a strategy most likely to reach a goal, but … Read more

What is Data Mining?

Data Mining is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. It is an interdisciplinary subfield of computer science. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further … Read more

What is Cross-Validation?

Cross-validation is a model validation technique for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice. The idea is to define a dataset to “test” the model in … Read more
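
As a rough illustration of the "hold out a dataset to test the model" idea, here is a minimal sketch of k-fold splitting in plain Python. The excerpt does not name a specific scheme, so k-fold is an assumption here, and the helper name `k_fold_indices` is hypothetical. It only produces index splits; fitting and scoring a model on each split is left out.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: plain k-fold without shuffling; each sample lands in
    exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical usage: 10 samples split into 3 folds.
for train_idx, test_idx in k_fold_indices(10, 3):
    print("train:", train_idx, "test:", test_idx)
```

In practice the data would usually be shuffled before splitting, and the model's score would be averaged over the folds.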

What is Correlation?

Correlation is a statistical measure that can show whether and how strongly pairs of variables are related. For example, height and weight are related; taller people tend to be heavier than shorter people. The relationship isn’t perfect. People of the same height vary in weight, and you can easily think of two people you know where the shorter one is … Read more
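
To make "whether and how strongly" concrete, here is a small sketch that computes the Pearson correlation coefficient, one common correlation measure (the excerpt does not say which measure it has in mind, so Pearson is an assumption). The function name `pearson_r` and the height and weight figures are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations; the common 1/(n-1) factors cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: taller people tend to be heavier, but not perfectly so.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 61, 66, 80, 85]
r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.3f}")  # close to +1: a strong positive relationship
```

A value near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one.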

What is a Convolutional Neural Network (CNN)?

A Convolutional Neural Network (CNN) is made up of neurons that have learnable weights and biases. CNNs are a category of neural networks that have proven very effective in areas such as image recognition and classification. ConvNets have been successful in identifying faces, objects, and traffic signs, apart from powering vision in robots and self-driving cars. CNNs transform the original image … Read more

What is Collaborative Filtering?

Collaborative filtering (CF) is a technique used by recommender systems. The term has two senses, a narrow one and a more general one. In the narrow sense, collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating). The underlying assumption of the collaborative filtering approach is that if … Read more

What is Cluster Analysis?

Cluster analysis, or clustering, is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in one sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data … Read more

What is Classification?

Classification, in machine learning and statistics, is the problem of identifying which of a set of categories (sub-populations) a new observation belongs to, on the basis of a training set of data containing observations (or instances) whose category membership is known. Classification is an example of pattern recognition. In the terminology of machine learning, classification is considered an instance of … Read more

What is the Chi-squared Test for Variances?

A chi-squared test for variances can be used to test whether the variance of a population is equal to a specified value. This test can be either a two-sided test or a one-sided test. The two-sided version tests against the alternative that the true variance is either less than or greater than the specified value. The one-sided version … Read more
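
For readers who want to see the arithmetic, the sketch below computes the usual test statistic for this test, T = (n - 1) s² / σ₀², where s² is the sample variance and σ₀² is the specified value. The formula comes from the standard textbook form of the test rather than from the excerpt, and the function name `chi2_variance_statistic` and the sample data are hypothetical. Turning the statistic into a decision requires a chi-squared table or a statistics library, which is left out here.

```python
import statistics


def chi2_variance_statistic(sample, sigma0_sq):
    """Return (T, degrees_of_freedom) for H0: population variance == sigma0_sq.

    T = (n - 1) * s^2 / sigma0_sq, with s^2 the sample variance.
    Assumes the data come from a normal population.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    if sigma0_sq <= 0:
        raise ValueError("sigma0_sq must be positive")
    s_sq = statistics.variance(sample)  # sample variance, n - 1 denominator
    return (n - 1) * s_sq / sigma0_sq, n - 1


# Hypothetical sample; test against a specified variance of 2.5.
t_stat, dof = chi2_variance_statistic([1, 2, 3, 4, 5], 2.5)
print(t_stat, dof)  # compare t_stat with chi-squared critical values for dof
```

For a two-sided test, T is compared with both a lower and an upper critical value of the chi-squared distribution with n - 1 degrees of freedom; a one-sided test uses only one of them.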