Training in statistical programming and data analysis
Statistical programming languages make it possible to manage datasets, run analyses and build elaborate charts. R, for example, is one of the languages of choice in machine learning and is supported by a large community of researchers and contributors around the world. R is open source. This training provides the foundations required to start analysis, programming and data mining projects in R.
Training objectives
- Perform common operations on a data table (sort, filter, select, merge, etc.)
- Import, clean, organize and export data files
- Create simple charts to visualize the data
- Write reusable custom functions
- Know how to search for help in the R community
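As a rough illustration of the first objective, the common table operations might look like the following in base R. This is only a sketch: the `orders` and `customers` tables, their columns and their values are invented for the example and are not part of the course material.

```r
# Hypothetical example data: all names and values are invented for illustration.
orders <- data.frame(
  id = c(1, 2, 3, 4),
  customer = c("A", "B", "A", "C"),
  amount = c(120, 45, 300, 80)
)
customers <- data.frame(
  customer = c("A", "B", "C"),
  region = c("North", "South", "North")
)

# Filter: keep the orders of at least 80
large <- orders[orders$amount >= 80, ]

# Sort: order the rows by decreasing amount
sorted <- large[order(-large$amount), ]

# Select: keep only two columns
selected <- sorted[, c("customer", "amount")]

# Merge: add the region of each customer
merged <- merge(selected, customers, by = "customer")
print(merged)
```

Packages such as dplyr offer an alternative syntax for the same operations; base R is used here only so that the sketch runs without installing anything.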
Target audience
- Anyone wishing to use a programming language to manipulate data and generate reports.
- This course requires knowledge of the basics of computer programming, regardless of the language.
Format
- Virtual or in-person
- 60% theory and 40% practice (bring your own dataset for the exercises)
- Duration: 16 hours (can be split into 4 sessions of 4 hours)
Content
- The R environment and RStudio
- Types of variables
- Vectors, matrices and dataframes, selection of subsets
- Basic operations and logic
- Structure of if-else statements and of for and while loops
- Structure of a reusable custom function
- Vectorized functions and implicit loops
- How to find help on the web
- Practical workshops and exercises: import a data file; obtain descriptive statistics; build simple charts; do a Pareto analysis; write a custom function and call it; copy and paste results and charts to other software
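To give an idea of what several of these topics look like in practice, here is a minimal sketch combining a custom function, an explicit loop, its vectorized equivalent and a small Pareto table. The function name `pct_of_total` and the defect counts are assumptions made up for this illustration, not workshop material.

```r
# Custom function: share of each value in the total, in percent
pct_of_total <- function(x) {
  100 * x / sum(x)
}

# Hypothetical defect counts by cause (invented data)
defects <- c(scratch = 42, dent = 13, stain = 30, crack = 15)

# Explicit loop: compute each share one element at a time
shares_loop <- numeric(length(defects))
for (i in seq_along(defects)) {
  shares_loop[i] <- 100 * defects[[i]] / sum(defects)
}

# Vectorized version: the same computation on the whole vector at once
shares <- pct_of_total(defects)

# Pareto table: causes sorted by decreasing share, with the cumulative share
sorted <- sort(shares, decreasing = TRUE)
pareto <- data.frame(
  cause = names(sorted),
  share = as.numeric(sorted),
  cumulative = cumsum(as.numeric(sorted))
)
print(pareto)
```

The loop and the vectorized call give the same numbers; the vectorized form is shorter and is usually the more idiomatic choice in R.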
Requirements
- Be familiar with computer tools in general
- Ideally have use cases in mind
- Have installed the R software
- Have installed the free version of RStudio
This training is practical; theoretical elements are kept to a minimum. For more information or to book your training, please contact us!