https://secretdatascientist.com

Trading with Python Intro – Data Import

Traditionally, there have been two general ways of analyzing market data: fundamental analysis, focused on underlying fundamental data, and technical analysis, focused on charts and price movements. In recent years, computer science and mathematics have revolutionized trading; it has become...

How would you validate-test a predictive model?

Why evaluate/test a model at all? Evaluating the performance of a model is one of the most important stages in predictive modeling; it indicates how successful the model has been for the dataset. It enables...
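
As a rough illustration of what evaluating a model on held-out data can look like, here is a minimal sketch of a hold-out split. It is not taken from the post: the data, the `fit_mean_model` "model", and the helper names are all made up for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_mean_model(train):
    """Hypothetical 'model': always predicts the mean target seen in training."""
    targets = [y for _, y in train]
    return sum(targets) / len(targets)


def mean_absolute_error(prediction, test):
    """Average absolute gap between the constant prediction and the test targets."""
    return sum(abs(y - prediction) for _, y in test) / len(test)


# Made-up (feature, target) pairs, for illustration only.
data = [(x, 2.0 * x + 1.0) for x in range(20)]

train, test = train_test_split(data)
model = fit_mean_model(train)
holdout_mae = mean_absolute_error(model, test)
print(f"hold-out MAE: {holdout_mae:.2f}")
```

The point of the split is that the error is measured only on rows the model never saw during fitting, so it says something about new data rather than about memorized data.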

Introduction to TensorFlow

What is TensorFlow? The shortest definition would be: TensorFlow is a general-purpose library for graph-based computation. But there are a variety of other ways to define TensorFlow; for example, Rodolfo Bonnin in his book – Building Machine...
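
To make "graph-based computation" a little more concrete, here is a toy sketch in plain Python. It is not TensorFlow's API; the `Node` class and the `const`, `add`, and `mul` helpers are hypothetical names used only to show the idea of first building a graph of operations and then evaluating it.

```python
class Node:
    """A node in a toy computation graph: an operation plus its input nodes."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def evaluate(self):
        """Recursively evaluate the inputs, then apply this node's operation."""
        if self.op == "const":
            return self.value
        args = [node.evaluate() for node in self.inputs]
        if self.op == "add":
            return args[0] + args[1]
        if self.op == "mul":
            return args[0] * args[1]
        raise ValueError(f"unknown op: {self.op}")


def const(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


# Build the graph for (2 + 3) * 4; nothing is computed at this point.
graph = mul(add(const(2), const(3)), const(4))

# Computation happens only when the graph is evaluated.
result = graph.evaluate()
print(result)
```

Separating graph construction from evaluation is the property that lets a real library inspect, optimize, or distribute the computation before running it.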

Where to learn TensorFlow for Free?

Below is a list of free resources to learn TensorFlow: TensorFlow website: www.tensorflow.org; Udacity free course: www.udacity.com; Google Cloud Platform: cloud.google.com; Coursera free course: www.coursera.org; Machine Learning with TensorFlow by Nishant Shukla: www.tensorflowbook.com; ‘First Contact With TensorFlow’ by Prof. Jordi Torres: jorditorres.org or you...

TensorFlow Cheat Sheet.

TensorFlow Quick Reference Table – Cheat Sheet. TensorFlow is a very popular deep learning library, but its complexity can be overwhelming, especially for new users. Here is a short summary of often-used functions; if you want to download it in...

Popular Pandas snippets used in data analysis.

Pandas is a very popular Python library for data analysis, manipulation, and visualization. I would like to share my personal view on the list of most often used functions/snippets for data analysis. 1. Import Pandas...

Numerai – deep learning example code.

In a previous post on Numerai, I described very basic code for getting into the world of machine learning competitions. This one is a continuation, so if you haven’t read it, I recommend doing so here. In...

Intro to Machine Learning

What is the definition of Machine Learning? Machine Learning is a subfield of computer science that gives computers the ability to learn without being explicitly programmed. The goal of Machine Learning is to develop learning algorithms that do the learning automatically without...

DATA SCIENCE QUESTIONS AND ANSWERS

What is Hadoop YARN?

Hadoop YARN is the architectural center of Hadoop that allows multiple data processing engines such as interactive SQL, real-time streaming,...

What is Apache Kafka?

Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The...

What is Hadoop ZooKeeper?

Hadoop ZooKeeper is an open source Apache™ project that provides a centralized infrastructure and services that enable synchronization across a...

What is Hadoop HBase?

Hadoop HBase is a column-oriented database management system that runs on top of HDFS. It is well suited for sparse...

What is Hadoop Sqoop?

Hadoop Sqoop efficiently transfers bulk data between Apache Hadoop and structured datastores such as relational databases. Sqoop helps offload certain...

What is Hadoop Hive?

Hadoop Hive is a runtime Hadoop support structure that allows anyone who is already fluent with SQL (which is commonplace...

What is Hadoop Pig?

Hadoop Pig was initially developed at Yahoo to allow people using Hadoop to focus more on analyzing large datasets and...

What is Type II Error?

Type II Error in statistical hypothesis testing is incorrectly retaining a false null hypothesis (a “false negative”). A type II...

What is Type I Error?

Type I Error in statistical hypothesis testing is the incorrect rejection of a true null hypothesis (a false positive). More...
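
The two error types can be told apart by comparing what is actually true with what the test decided. The sketch below counts both over a handful of made-up test outcomes; the `count_errors` helper and the sample lists are hypothetical and only illustrate the definitions.

```python
def count_errors(null_is_true, rejected):
    """Count Type I and Type II errors over paired test outcomes.

    null_is_true[i] says whether the null hypothesis really holds in case i;
    rejected[i] says whether the test rejected it.
    """
    pairs = list(zip(null_is_true, rejected))
    # Type I: the null hypothesis is true but was rejected (false positive).
    type_1 = sum(1 for is_true, was_rejected in pairs if is_true and was_rejected)
    # Type II: the null hypothesis is false but was retained (false negative).
    type_2 = sum(1 for is_true, was_rejected in pairs if not is_true and not was_rejected)
    return type_1, type_2


# Made-up outcomes for six hypothetical tests.
null_is_true = [True, True, False, False, True, False]
rejected = [False, True, True, False, False, False]

type_1, type_2 = count_errors(null_is_true, rejected)
print(f"Type I errors: {type_1}, Type II errors: {type_2}")
```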

What is Sentiment Analysis?

Sentiment Analysis refers to the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract,...
