How strong is Yitu’s "Seeking" chip? A look at its four core advantages and the story behind Yitu’s chip-making

  Highlights: Its intelligence density surpasses NVIDIA’s, and its visual processing performance for autonomous driving is comparable to that of Tesla’s FSD chip.

  Zhidongxi reported on May 9 that Yitu Technology held an unusual product launch in Shanghai this morning.

  This time Yitu unveiled neither face recognition nor speech recognition, but a custom SoC chip for cloud deep learning inference: questcore, known in Chinese as "Seeking".

  The chip is commercially available as of today, and Yitu also showcased a series of products and industry solutions based on it.

  Leo Zhu (Zhu Long), co-founder and CEO of Yitu, appeared at a press conference for the first time today. He put forward the slogan "the algorithm is the chip" and shared some of his views on AI chips.

  Reportedly, this is the world’s first custom SoC chip for cloud deep learning, and it is already in mass production. Yitu Atomic servers equipped with questcore will support security services at the second China International Import Expo, to be held in Shanghai in November this year.

  The release of questcore means that one of the four computer vision (CV) unicorns on the artificial intelligence (AI) track has, through a "vision, voice, chip" triple jump, transformed itself into an all-round AI company whose products span vision to voice and software to hardware.

  What are the highlights of Yitu’s questcore chip? What characterizes ManyCore, its innovative architecture? Who helped build the chip? What advantages does Yitu have in chip-making? What is the inevitable trend in AI chip development? After in-depth conversations with Yitu insiders, Zhidongxi lays out a clearer picture of Yitu’s questcore chip.

  The world’s first custom SoC chip for cloud deep learning

  Yitu has launched its first self-developed chip, questcore, called "Seeking" in Chinese. It embodies Yitu’s pursuit of "extreme intelligence" and was jointly developed by Yitu Technology and the AI chip startup ThinkForce.

  According to Yitu, this custom SoC for cloud deep learning was developed domestically from design to manufacturing and carries independent intellectual property rights. Built on the concept of Domain Specific Architecture (DSA), it is designed specifically for computer vision applications, accelerating the operations common in that field, and suits visual inference tasks such as face recognition, vehicle detection, structured video analysis and pedestrian re-identification.

  For now, Yitu will use questcore chips in its own cloud and edge servers, pair them with its intelligent visual analysis software, and sell them as integrated software and hardware solutions.

  At the launch, Leo Zhu, co-founder and CEO of Yitu, introduced four design concepts behind the chip: high density, world-class AI algorithms, 64-channel video analysis, and a server-class AI chip.

  Leo Zhu said that the questcore chip is not an AI acceleration module, but a complete AI processor with end-to-end capability.

  As a cloud server chip, it can run independently without relying on an Intel x86 CPU. Although it was designed as a server chip, questcore supports both the cloud and the edge.

  The specifications and parameters of questcore chip are as follows:

  According to Yitu, in real cloud deployments questcore delivers up to 15 TOPS of visual inference performance with a maximum power consumption of only 20 W, less than an ordinary light bulb.

  At the same power consumption, questcore’s visual inference performance is 2 to 5 times that of mainstream comparable products on the market, and in security camera workloads its power consumption per video channel is only 30% of that of an NVIDIA P4 GPU.

  Leo Zhu said that Yitu developed this chip not to chase hundreds of TOPS of raw computing power as NVIDIA does, but to prioritize high computing density.

  The questcore chip is highly integrated, adapts efficiently to a range of deep learning algorithms, offers good model compatibility and scalability, supports deep learning frameworks such as TensorFlow and PyTorch, and plugs into the existing ecosystem.

  It also includes its own networking module and supports virtualization and containerization, which Yitu says improves elastic computing and scheduling in the AI cloud by an order of magnitude.

  The chip is suited to accelerating visual inference tasks in areas such as transportation, public safety, smart healthcare and smart retail, especially in enterprise environments with strong demand for real-time intelligent video analysis in the cloud.

  At the event, Leo Zhu also specifically mentioned Tesla’s recently released Full Self-Driving (FSD) chip. He said that Yitu’s chip is similar to Tesla’s; Tesla began building its chip three years ago, while Yitu took only two.

  Leo Zhu said that Yitu will not build self-driving systems itself, but welcomes cooperation with driverless car companies.

  Yitu releases Atomic series cloud servers and edge boxes

  Alongside the chip, Yitu released the Yitu Atomic series of cloud servers based on questcore and the edge-oriented Yitu Frontier series of edge boxes.

  Lu Hao, chief innovation officer of Yitu Technology, gave a live plug-in demonstration on stage, holding a Yitu Atomic server roughly the size of a 15-inch Apple MacBook Pro and built around four questcore chips.

  Yitu used 200 cameras to capture the faces of the audience on site and successfully ran real-time face recognition and matching.

  Lu Hao showed the internal structure of the server, which is much thinner than a conventional server chassis and can be lifted with one hand.

  The Yitu Atomic server is built on questcore. One server provides the same computing power as a server with eight NVIDIA P4 cards, at half the volume and less than 20% of the power consumption.

  Lu Hao said that an NVIDIA P4 graphics card costing about $2,000 can support only 27 cameras, while the Yitu server can drive 200 channels of real-time decoding and video analysis with power consumption not exceeding 250 W.
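
  The figures Lu Hao quoted can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above (27 cameras and about $2,000 per P4 card, 200 channels and at most 250 W per Yitu server). The variable names are illustrative, and the result is a back-of-the-envelope comparison of vendor claims, not a benchmark.

```python
import math

# Figures quoted at the launch (vendor claims, not independent measurements).
P4_CAMERAS_PER_CARD = 27      # video channels one NVIDIA P4 card supports
P4_PRICE_USD = 2000           # approximate price of one P4 card
SERVER_CHANNELS = 200         # channels one Yitu Atomic server drives
SERVER_MAX_POWER_W = 250      # stated upper bound on server power

# P4 cards needed to match one server's 200 channels.
p4_cards_needed = math.ceil(SERVER_CHANNELS / P4_CAMERAS_PER_CARD)

# Card cost alone for that many P4s (the host server is not included).
p4_card_cost_usd = p4_cards_needed * P4_PRICE_USD

# Upper bound on server power per video channel.
watts_per_channel = SERVER_MAX_POWER_W / SERVER_CHANNELS

print(p4_cards_needed, p4_card_cost_usd, watts_per_channel)
```

  Under these numbers, matching one server takes eight P4 cards, which is consistent with the comparison to an eight-card P4 server made above.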

  For video analysis, the server’s power consumption per video channel is only 20% of that of an 8-card NVIDIA T4 server (including dual-core Intel x86 CPU) and about 10% of that of an 8-card NVIDIA P4 server (also including dual-core Intel x86 CPU).

  Lu Hao said that the questcore chip can handle essentially all visual analysis tasks, can be used for lung cancer diagnosis, children’s bone age prediction and many smart city applications, and is expected to become a hub for many visual tasks in the future.

  Reportedly, a video analysis system built on questcore shrinks a deployment that originally required 16 cabinets down to one, cutting overall construction costs by 50% and operation and maintenance costs by 80%.

  In addition, the Yitu Atomic server allows systems to be upgraded directly in the cloud, with no need for large-scale purchase or replacement of existing terminal equipment such as cameras and sensors, which greatly improves the utilization of existing infrastructure.

  Leo Zhu said he hopes the questcore chip will make intelligent analysis of 10,000 video channels a standard capability.

  Questcore’s biggest feature: flexible and scalable, serving both cloud and edge

  Questcore uses a self-developed chip architecture that, beyond accelerating machine vision inference, is flexible and scalable. Different application scenarios were considered in the design, so it can meet the visual inference computing needs of both cloud and edge.

  1. It can serve as a server chip and can also run independently.

  SoC stands for System-on-Chip. It generally refers to integrating multiple IP blocks on a single chip.

  Questcore uses a customizable SoC design: customers can choose different configurations for different usage requirements and customize as needed.

  Yitu argues that SoCs have clear advantages in performance, cost, power consumption, reliability, life cycle and range of application, and that they are an inevitable trend in integrated circuit design.

  Compared with AI accelerators such as NVIDIA GPUs and Google TPUs, and with products from other AI chip companies, Yitu’s chip differs in that it can run independently as a server chip without relying on an Intel x86 CPU.

  2. Self-developed chip architecture, with the algorithm team deeply involved in chip design.

  Moore’s Law is approaching its physical limits, yet algorithm performance is still improving ten-thousand-fold. Yitu says the accuracy of its face recognition algorithm has improved 100,000-fold over the past four years. Against this backdrop, AI is pushing the computing industry into the era of "the algorithm is the chip".

  Existing computer architectures can no longer meet AI’s growing demand for computing power, and domain-specific architecture (DSA) is becoming the mainstream of future computing. A groundbreaking example is Google’s TPU, which accelerates deep neural networks (DNNs).

  Yitu therefore set out to rethink computing and computer architecture with AI at the center.

  Yitu believes that in the era of "the algorithm is the chip", a good algorithm can use a chip architecture more efficiently, or guide its design, and in doing so convert computing power into intelligence more efficiently.

  Questcore is based on the concept of Domain Specific Architecture (DSA) and is optimized for its application domain.

  For DSA chips, understanding of AI domain knowledge matters most, including the trajectory of algorithms, practical application scenarios and insight into specific business logic.

  Yitu’s long experience in the vision field has given it a deep understanding of machine vision technology and the industry, laying the foundation for building a better visual AI chip.

  From the very beginning, Yitu’s algorithm team was deeply involved in the chip’s design. The self-developed architecture is intended to maximize computing power and bring out the performance of Yitu’s world-leading algorithms while keeping power consumption very low.

  The ManyCore architecture adapts efficiently to a range of deep learning algorithms, offers good model compatibility, and supports deep learning frameworks such as TensorFlow, PyTorch and Caffe, which eases integration with the existing ecosystem.

  According to Yitu, the architecture’s biggest advantage is its scalability. The design considers the deep learning inference requirements of both cloud and edge, and it can be scaled dynamically to fit actual application scenarios.

  3. The key to an excellent performance-per-watt ratio: accelerating INT8 data only.

  Yitu told Zhidongxi that, given the industry’s current and future demand for visual analysis applications, this is the most cost-effective solution at present.

  Drawing on its long practice and research in the vision field, Yitu developed a chip architecture that accelerates only INT8 data (the 8-bit integer data type), which is one of the fundamental reasons questcore achieves an order-of-magnitude improvement in performance per watt.

  Earlier, Qualcomm said that by 2025 the data center inference accelerator market may reach $17 billion. Unlike AI training, AI inference does not need high precision; low-precision data types such as INT8 or even INT4 are enough for most cloud-based intelligent video analysis and visual inference workloads.

  However, few AI chips on the market are built to accelerate this kind of low-precision inference.
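
  To make the idea of INT8 inference concrete, here is a minimal sketch of symmetric 8-bit quantization, a common generic technique. It is not Yitu’s scheme, which the article does not describe; the function names and sample weights are hypothetical and chosen only for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to signed 8-bit integers."""
    max_abs = max(abs(v) for v in values)
    # One integer step corresponds to `scale` in floating point.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp to the symmetric range [-127, 127].
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map 8-bit integers back to approximate float values."""
    return [q * scale for q in quantized]


# Hypothetical weights, for illustration only.
weights = [0.4, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding to the nearest step bounds the error by half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, max_error)
```

  Each value is stored in one byte instead of four, and the rounding error stays within half a quantization step, which is why inference workloads can often tolerate INT8 where training cannot.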

  4. Integrated software and hardware that adapts flexibly to existing infrastructure.

  Yitu’s chip will be combined with Yitu’s intelligent software into integrated software and hardware products or solutions for sale, which can adapt flexibly to customers’ existing software and hardware infrastructure.

  Yitu emphasizes software and hardware integration because no software or hardware on the market can be used in isolation; the whole industry and ecosystem must work closely together. Launching products and solutions is only the beginning, and follow-up customer services such as operation and maintenance are also key to whether a product succeeds.

  The questcore chip Yitu has now launched has a flexible, scalable architecture and can run on its own without an Intel x86 CPU, so products and solutions based on it can serve both cloud and edge computing needs.

  Next, Yitu will continue to invest in original and advanced AI software and hardware technology research and development, and strive to bring better software and hardware integrated products and industry solutions to customers and the market.

  Yitu’s chip-making logic: it does not make chips for the sake of making chips

  Yitu is one of China’s four CV unicorns. It is known for its world-leading AI technology, and its most recent valuation reached 15 billion yuan.

  For example, in the Face Recognition Vendor Test (FRVT) run by the U.S. National Institute of Standards and Technology (NIST), Yitu has taken first place for three consecutive years.

  Leo Zhu, co-founder and CEO of Yitu, believes that the key to making AI widespread is intelligence density: machine intelligence must become cheaper, and the essence of cheapness is density, which has two dimensions:

  The first is macro: moving from single-device intelligence to machine intelligence to collective intelligence. For example, scaling from a single camera that recognizes faces to 10,000 cameras, where the worlds recognized by those 10,000 cameras can be shared and used to make decisions.

  The second is micro: a single machine must be able to support the kind of intelligence just described. What matters here is intelligent computing power, not the raw computing power of a machine.

  Why would a company rooted in AI algorithms leave its comfort zone and take on self-developed AI chips, which is no easy path?

  Yitu firmly believes that in an intelligent era with no precedent to follow, Chinese AI companies stand on the same starting line as the world’s technology giants and have every chance of becoming giants of the new era.

  Holding this belief, Yitu came to feel, while deploying AI projects, the necessity and urgency of customizing AI chips for application scenarios and business logic, and foresaw that software and hardware integration was the inevitable direction of AI technology.

  Yitu told Zhidongxi that its main reason for developing its own chip is not to commercialize the chip itself, but to serve real scenarios: to bring out the full performance of Yitu’s algorithms and software and to offer customers integrated solutions with the best performance, power consumption and cost for specific scenarios.

  This is similar to the logic behind the self-developed chips of Google, Microsoft, Alibaba and other companies.

  With existing computer architectures unable to meet AI’s demand for computing power, Moore’s Law nearing its end, and algorithm performance still soaring, Yitu realized early on that computer architecture should be rethought with AI at the center.

  Yitu firmly believes that in this new round of architectural revolution, algorithms and chips are tightly coupled and inseparable. Only AI companies that understand algorithms can build better AI chips and convert computing power into intelligence more efficiently.

  For any new technology to be adopted at scale, the basic premise is that its price and cost fall to a range most people can afford. Only then can the product become truly widespread.

  Yitu hopes that by developing AI chips, it can promote the real-world deployment and broad adoption of AI technology.

  A year and a half of chip-making: the team behind the scenes

  Yitu’s chip-making plan first surfaced at the end of 2017.

  In December last year, Yitu officially announced a strategic investment in the AI chip startup ThinkForce. Public records show that Yitu is a shareholder and partner of ThinkForce.

  ThinkForce is one of the few teams in China with end-to-end chip research and development capability. Its core members come from leading chip industry companies such as IBM, AMD, Intel, LSI, Broadcom, Cadence and ZTE. All have more than 10 years of industry experience, with deep expertise in chip design, architecture and algorithm research, and together they have taken more than 40 different chips to mass production, with total sales reaching billions of dollars.

  ThinkForce’s AI chip features an innovative heterogeneous computing architecture, complete functional modules and an advanced semiconductor manufacturing process, allowing it to complete more deep neural network matrix operations in less time and at lower power.

  To build a friendly development environment and a healthy industry ecosystem, ThinkForce has also prepared easy-to-use software tools that help users deploy their algorithms on the hardware.

  Yitu’s algorithm team and ThinkForce’s hardware team worked closely together from the start, and the resulting questcore chip makes full use of Yitu’s accumulated expertise in algorithms and software.

  Conclusion: software and hardware integration is becoming the inevitable direction for AI deployment

  The arrival of the questcore chip not only demonstrates Yitu’s vertical integration capability from software to hardware, but also adds a new option to the domestic market for data center server AI chips.

  With questcore in mass production, Yitu now has the most comprehensive technology portfolio among AI companies. Its product lineup covers computer vision, natural language processing, speech recognition and AI chips, and it maintains world-leading technical advantages in these fields.

  Yitu’s decision to make chips shows that, as Moore’s Law approaches its limits and general-purpose chips struggle to meet every requirement, customizing AI chips for application scenarios and business needs is becoming a new trend, as with Google’s TPU and Baidu’s Kunlun.

  At the same time, algorithms and chips are growing closer, and software and hardware integration is becoming the inevitable direction for AI deployment. On an AI chip battlefield crowded with strong players and giants, a company with a solid foundation in AI algorithms and strong vertical integration of software and hardware, able to tailor computing power to its own strengths and application needs, will undoubtedly gain in product competitiveness and technical barriers.

This article first appeared on the WeChat official account Zhidongxi. The views expressed are the author’s own and do not represent Hexun.com’s position. Investors who act on this information do so at their own risk.

(Editor: He Yihua HN110)