mit-15-003-data-science-tools

Study guides for MIT's 15.003 Data Science Tools
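License MIT License
Development language SHELL
Category Application tools, Terminal/Remote login
Software type Open source software
Region Unknown
Submitter 叶国兴
Operating system Cross-platform
Open source organization
Intended audience Unknown
 Software overview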

Data Science Tools study guides for MIT's 15.003

Goal

This repository gathers in one place the important notions covered in MIT's 15.003 Data Science Tools course. It includes:

  • Study guides for SQL, R, Python, Git and Bash.
  • Conversion guides between R and Python.
  • All elements of the above combined in an ultimate compilation of concepts, to have with you at all times!

Content

Study guides

  • Data retrieval with SQL
  • Data manipulation with R
  • Data manipulation with Python
  • Visualization with R
  • Visualization with Python
  • Engineering tips

Conversion guides between R and Python

  • Data manipulation
  • Visualization

Super study guide

All of the above gathered in one place

Website

This material is also available on a dedicated website, so that you can enjoy reading it from any device.

Authors

Afshine Amidi (Ecole Centrale Paris, MIT) and Shervine Amidi (Ecole Centrale Paris, Stanford University)
