How to become a dialogue system engineer
Dialogue systems (conversational robots) let a machine understand human language through techniques such as machine learning and artificial intelligence. They combine methods from many disciplines, which makes building one an intensive training ground for artificial intelligence. Figure 1 shows the main techniques involved in developing a dialogue system. What are the technologies shown in Figure 1, and where can you learn them? Each is explained in turn below.

Matrix computation studies the properties of a single matrix or of several matrices. Machine learning models rely on many of these properties. For example, PCA amounts to computing the eigenvectors of the data covariance matrix, and matrix factorization (MF) computes factors analogous to the singular vectors of an SVD. Many tools in the field of artificial intelligence are programmed in terms of matrices, including mainstream deep learning frameworks such as TensorFlow and PyTorch. There are many textbooks on matrix computation; pick one at a difficulty level that suits you. For a deeper understanding, the book "Linear Algebra Done Right" is highly recommended.

Probability and statistics are the foundation of machine learning. Commonly used concepts include: random variables, discrete random variables, continuous random variables, probability densities/distributions (binomial, multinomial, Gaussian, and exponential family distributions), conditional densities/distributions, prior densities/distributions, posterior densities/distributions, maximum likelihood estimation, and maximum a posteriori estimation. For a quick overview, you can read classic machine learning texts, such as the first two chapters of "Pattern Recognition and Machine Learning" or the first two chapters of "Machine Learning: A Probabilistic Perspective".
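As a small illustration of maximum likelihood estimation and maximum a posteriori estimation from the concept list above, the sketch below estimates the success probability of a Bernoulli variable. The Beta(2, 2) prior, the function names, and the sample data are all assumptions chosen for illustration; they do not come from any of the recommended books.

```python
def mle_bernoulli(samples):
    """Maximum likelihood estimate of p: the sample mean of 0/1 outcomes."""
    return sum(samples) / len(samples)


def map_bernoulli(samples, alpha=2.0, beta=2.0):
    """MAP estimate of p under an assumed Beta(alpha, beta) prior.

    The posterior is Beta(alpha + k, beta + n - k); its mode is
    (alpha + k - 1) / (alpha + beta + n - 2), valid when alpha, beta > 1.
    """
    n = len(samples)
    k = sum(samples)
    return (alpha + k - 1) / (alpha + beta + n - 2)


# Hypothetical data: 7 successes in 10 trials.
samples = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
p_mle = mle_bernoulli(samples)
p_map = map_bernoulli(samples)
print(f"MLE: {p_mle:.3f}, MAP: {p_map:.3f}")
```

With so few samples, the prior pulls the MAP estimate toward 0.5, while the MLE simply reports the observed frequency. As the number of samples grows, the two estimates move closer together.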
For systematic study, use a university textbook on probability and statistics.

Optimization methods are widely used to train machine learning models. Optimization concepts that come up often in machine learning include: convex/nonconvex functions, gradient descent, stochastic gradient descent, and primal and dual problems. Most machine learning materials and courses teach some optimization, for example the "Convex Optimization Overview" by Zico Kolter from Andrew Ng's Machine Learning course. The best way to learn the subject systematically is Boyd's book "Convex Optimization", together with the corresponding slides (https://web.stanford.edu/~boyd/cvxbook/) and courses (https://see.stanford.edu/Course/EE364A, https://see.stanford.edu/Course/EE364B). Readers who prefer reading code can study the optimization methods implemented in open-source machine learning projects; Liblinear, LibSVM, and TensorFlow are good choices.

Some commonly used Python math packages:
NumPy: scientific computing package for array (tensor) computation
SciPy: mathematical computing toolkit for science and engineering
Matplotlib: plotting and visualization package

Andrew Ng's "Machine Learning" course remains the go-to introduction to the field of machine learning. Do not underestimate it for being introductory: if you understand its content, you can apply for algorithm engineer positions. Several well-regarded textbooks are recommended: "The Elements of Statistical Learning" by Hastie et al., Bishop's "Pattern Recognition and Machine Learning", Murphy's "Machine Learning: A Probabilistic Perspective", and Zhou Zhihua's "Machine Learning" (known as the Watermelon Book). For deep learning, the book "Deep Learning" by Goodfellow, Bengio, and Courville and the official TensorFlow tutorials are recommended.
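To make gradient descent and stochastic gradient descent, two of the optimization concepts listed above, more concrete, here is a minimal sketch that fits a line by least squares using only the Python standard library. The data, learning rates, and step counts are assumptions picked so that the demo converges; real projects would normally rely on a library optimizer.

```python
import random


def gradient_descent(xs, ys, lr=0.05, steps=500):
    """Batch gradient descent for y = w * x + b with mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def stochastic_gradient_descent(xs, ys, lr=0.01, epochs=200, seed=0):
    """SGD: update the parameters on one shuffled example at a time."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            err = w * x + b - y
            w -= lr * 2 * err * x
            b -= lr * 2 * err
    return w, b


# Synthetic, noise-free data from y = 3x + 1 (an assumption for the demo).
xs = [i / 10 for i in range(-20, 21)]
ys = [3 * x + 1 for x in xs]

w_gd, b_gd = gradient_descent(xs, ys)
w_sgd, b_sgd = stochastic_gradient_descent(xs, ys)
print(f"GD:  w={w_gd:.3f}, b={b_gd:.3f}")
print(f"SGD: w={w_sgd:.3f}, b={b_sgd:.3f}")
```

Both variants minimize the same convex objective. The batch version uses the full gradient at every step, while SGD trades exactness per step for much cheaper updates, which is why it dominates in large-scale training.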
Some commonly used tools:
Scikit-learn: Python package containing a wide range of machine learning models
Liblinear: efficient training methods for a variety of linear models
LibSVM: efficient training methods for a variety of SVMs
TensorFlow: Google's deep learning framework
PyTorch: Facebook's deep learning framework
Keras: high-level deep learning framework
Caffe: an older, established deep learning framework

Many universities have NLP research teams, such as the Stanford NLP Group and, in China, the SCIR laboratory at Harbin Institute of Technology. The work of these teams is worth following. NLP materials are widely available online. For a course, Stanford's "CS224n: Natural Language Processing with Deep Learning" is recommended. For a book, Manning's "Foundations of Statistical Natural Language Processing" is recommended (a Chinese edition is available).

For information retrieval, Manning's classic book "Introduction to Information Retrieval" (Chinese edition translated by Wang Bin) and the Stanford course "CS 276: Information Retrieval and Web Search" are recommended.
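As a rough sketch of the kind of technique the information retrieval materials above cover, the example below ranks a handful of documents against a query using TF-IDF weights and cosine similarity. The whitespace tokenizer, the smoothed IDF formula, and the FAQ sentences are assumptions made for illustration; one plausible use, assumed here, is matching a user utterance against stored questions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for the demo)."""
    return text.lower().split()


def build_index(docs):
    """Return per-document term counts and inverse document frequencies."""
    term_counts = [Counter(tokenize(d)) for d in docs]
    df = Counter()
    for counts in term_counts:
        df.update(counts.keys())
    n = len(docs)
    # Smoothed IDF so terms found in every document keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return term_counts, idf


def tfidf_vector(counts, idf):
    """Turn term counts into a sparse TF-IDF vector; unknown terms are dropped."""
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, docs, term_counts, idf):
    """Rank documents by cosine similarity to the query, best first."""
    q_vec = tfidf_vector(Counter(tokenize(query)), idf)
    scored = [
        (cosine(q_vec, tfidf_vector(counts, idf)), i)
        for i, counts in enumerate(term_counts)
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(docs[i], score) for score, i in scored]


# Hypothetical FAQ entries.
docs = [
    "how do i reset my password",
    "what are your opening hours",
    "how do i track my order",
]
term_counts, idf = build_index(docs)
results = search("reset password", docs, term_counts, idf)
print(results[0][0])
```

Production systems would typically use a search engine or a library implementation instead of hand-rolled scoring, but the ranking idea is the same.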
Some commonly used tools:
Jieba: Python package for Chinese word segmentation and part-of-speech tagging
CoreNLP: Stanford's NLP tools (Java)
NLTK: Natural Language Toolkit
TextGrocery: efficient short text classification tool (note: Python 2 only)
LTP: Harbin Institute of Technology's Chinese natural language processing tool
Gensim: text analysis tool that includes a variety of topic models
Word2vec: efficient word representation learning tool
GloVe: Stanford's word representation learning tool
fastText: efficient word representation learning and sentence classification library
FuzzyWuzzy: tool for computing the similarity between texts
CRF++: lightweight conditional random field library (C++)
Elasticsearch: open-source search engine

Dialogue systems use different technical frameworks for different types of users. Here are a few types of dialogue robots.
Figure 1: Dialogue system skill tree