AWS – Analytics India Magazine
AIM - News and Insights on AI, GCC, IT, and Tech
https://analyticsindiamag.com
Fri, 07 Mar 2025 12:10:34 +0000

AWS Launches New Division to Develop Agentic AI
https://analyticsindiamag.com/ai-news-updates/aws-launches-new-division-to-develop-agentic-ai/
Wed, 05 Mar 2025 06:43:20 +0000
Amazon is set to release a new reasoning model under its Nova branding by June this year.

Amazon Web Services (AWS) has started a new division focused on agentic AI to improve automation for its users and businesses. According to Reuters, the announcement was made in an internal email by AWS CEO Matt Garman.

The group will be led by Swami Sivasubramanian, who previously served as AWS’s vice president of AI and data. He will now report directly to Garman.

“Agentic AI has the potential to be the next multi-billion business for AWS,” Garman wrote in the email, highlighting the company’s commitment to advancing AI-driven automation. 

Agentic AI refers to systems that can execute tasks independently without requiring constant user prompts. Amazon recently showcased some of these capabilities with an updated version of its Alexa voice assistant, which is expected to roll out later this month.
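The difference from prompt-by-prompt chat can be sketched as a loop in which the system keeps planning and acting until it judges the task complete, with no user input in between. The tools and planner below are hypothetical stand-ins, not any Amazon API:

```python
# Minimal illustration of an agentic loop: the system chooses and executes
# tools on its own until the task is done. All tools here are stubs.

def lookup_weather(city: str) -> str:
    return f"Sunny in {city}"          # stub tool

def send_summary(text: str) -> str:
    return f"sent: {text}"             # stub tool

TOOLS = {"lookup_weather": lookup_weather, "send_summary": send_summary}

def plan(task: str, history: list) -> tuple:
    """Toy planner: picks the next tool call from the task and what has
    happened so far. A real agent would delegate this decision to an LLM."""
    if not history:
        return ("lookup_weather", task)
    if len(history) == 1:
        return ("send_summary", history[-1])
    return ("done", None)

def run_agent(task: str) -> list:
    history = []
    while True:
        action, arg = plan(task, history)
        if action == "done":
            return history
        history.append(TOOLS[action](arg))

print(run_agent("Paris"))
# → ['Sunny in Paris', 'sent: Sunny in Paris']
```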

“We have the opportunity to help our customers innovate even faster and unlock more possibilities, and I firmly believe that AI agents are core to this next wave of innovation,” he added.

Additionally, AWS senior vice president Peter DeSantis announced a restructuring of the company’s AI and hardware divisions. This includes merging the Bedrock and SageMaker AI groups under the compute organisation and forming a new team to streamline customer experience and commerce. DeSantis stated that these changes will “accelerate innovation”.

Meanwhile, Amazon is set to release a new reasoning model under its Nova branding by June this year, Business Insider reported. The model will function with a hybrid approach, meaning it can provide quick responses or use ‘extended thinking’ for more complex queries.
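A hybrid approach of this kind can be pictured as a router that sends simple queries down a fast path and harder ones through a longer reasoning pass. The heuristic below is purely illustrative; how Amazon's model actually decides has not been disclosed:

```python
# Illustrative hybrid-response router: quick answers for simple queries,
# an 'extended thinking' pass for complex ones. The complexity heuristic
# and both handlers are invented stand-ins, not Amazon's Nova model.

def looks_complex(query: str) -> bool:
    # Toy heuristic: long or analytical phrasing triggers the slow path.
    markers = ("prove", "step by step", "compare", "derive")
    return len(query.split()) > 20 or any(m in query.lower() for m in markers)

def quick_answer(query: str) -> str:
    return f"[fast] {query}"

def extended_thinking(query: str) -> str:
    return f"[slow] answer after reasoning over: {query}"

def respond(query: str) -> str:
    return extended_thinking(query) if looks_complex(query) else quick_answer(query)

print(respond("What time is it?"))           # fast path
print(respond("Prove that 2 + 2 = 4"))       # slow path
```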

AWS Unveils New Cloud Solutions to Boost 5G Networks for Telcos
https://analyticsindiamag.com/ai-news-updates/aws-unveils-new-cloud-solutions-to-boost-5g-networks-for-telcos/
Mon, 03 Mar 2025 12:33:26 +0000
The new AWS Outposts racks are built for high-speed 5G Core user plane and RAN workloads.

At the Mobile World Congress held in Barcelona, Amazon Web Services (AWS) on Monday announced AWS Outposts racks for high throughput, network-intensive workloads and AWS Outposts servers designed for Cloud Radio Access Network (C-RAN) workloads.

These new offerings enable telecom service providers, also known as telcos, to extend AWS infrastructure and services to deploy on-premises network functions requiring low latency, high throughput, and real-time performance.

Both offerings will be generally available later this year to support hosting 5G Core User Plane Function (UPF), RAN Centralised Unit (CU), and RAN Distributed Unit (DU) workloads.

“With the new AWS Outposts offerings, telcos can now run their entire 5G network, including 5G Core and 5G RAN, on AWS cloud services. These innovations will allow faster network deployment, better price performance, and improved customer experiences,” David Brown, vice president of compute and networking at AWS, said.

The new AWS Outposts racks are built for high-speed 5G Core user plane and RAN workloads. Telecom companies can place workloads in different locations depending on speed, latency, and data traffic needs. The system uses 4th Gen Intel Xeon Scalable Processors and a high-performance network fabric.

The company said the AWS Outposts racks offer scalability to handle increasing data traffic demands, ensuring telecom networks can expand efficiently. They provide enhanced security and performance through AWS’s Nitro System, delivering a reliable and protected environment. 

Moreover, the racks support automated deployment and management with AWS Kubernetes services, streamlining operations for telecom providers. 

Integrating with AWS analytics and monitoring tools also improves efficiency, allowing operators to monitor and optimise network performance seamlessly.

Notably, O2 Telefónica, a major telecom provider in Germany, is already using AWS for its cloud-based 5G core network.

Mallikarjun Rao, chief technology and enterprise officer at O2 Telefónica, said, “We are proud to have the first fully cloud-native 5G core network deployed on AWS, serving a million subscribers. The new AWS Outposts racks align with our strategy to build the network of the future.”

The AWS Outposts servers are tailored for Cloud RAN workloads, helping telecom providers deploy virtualised 5G networks more efficiently. These servers have been developed in collaboration with Nokia, with plans to integrate additional RAN vendors in the future.

Why AWS Outposts?

The AWS Outposts servers offer simplified operations with a pre-integrated cloud infrastructure, reducing the complexity of deployment and management. 

They enable faster 5G innovation by providing access to over 200 AWS cloud services, allowing telecom providers to develop and launch new features more efficiently. Powered by AWS Graviton3 processors, these servers deliver high-performance computing to meet the demanding requirements of 5G networks. 

Moreover, they ensure seamless integration with RAN vendors, maintaining high radio performance and supporting the smooth deployment of Cloud RAN solutions.

Major telecom operators like Orange and Du Network will begin testing these solutions later this year.

AWS’s new Outposts racks and servers are currently in preview and will be widely available later this year.

AWS Automates KYC and Fraud Detection—Makes Banks Failproof
https://analyticsindiamag.com/global-tech/aws-automates-kyc-and-fraud-detection-makes-banks-failproof/
Mon, 03 Mar 2025 08:39:25 +0000
Many fintechs and FSI firms are already using AI with AWS, including Dhan, HDFC Securities, Fibe, and Axis Bank.

The Indian Economic Survey 2025 indicated the growing adoption and impact of generative AI within India’s banking sector. It pointed out that several financial institutions and banks in India are increasingly leveraging AI to enhance their operations, improve customer experiences, and streamline services. 

However, alongside these advancements, challenges related to security and scaling remain. This is what AWS is trying to solve, offering solutions designed to address these concerns.

In an exclusive interview with AIM, Kiran Jagannath, head of financial services and conglomerates at Amazon Web Services (AWS) India and South Asia, revealed that banks are now open to using generative AI services.

According to him, AWS is helping BFSI companies integrate generative AI securely and efficiently. Many fintechs and FSI firms have already begun their generative AI journey using AWS with key adopters, including Dhan, HDFC Securities, Fibe, and Axis Bank.

He shared that for the stockbroking fintech startup Dhan, KYC timelines were long, and the company wanted to address this issue. The company used generative AI to shorten these timelines, automating 25% of the KYC process and reducing wait times by 50%. “They achieved this with a 30% reduction in operational costs,” Jagannath added.

The startup developed a chatbot solution based on LLMs and retrieval-augmented generation (RAG), using Amazon Lex, Amazon Bedrock Knowledge Bases, and Amazon Bedrock Agents. The GenAI chatbot automated KYC queries with multilingual voice and text support, enabling 24/7 customer service. Depending on the user’s preference, conversations can be routed to live agents, accompanied by a summary of the chat history.
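With boto3, invoking a Bedrock agent for a KYC-style query looks roughly like the sketch below. The agent and alias IDs are placeholders, and the exact flow Dhan built is not public:

```python
# Sketch of calling an Amazon Bedrock agent via boto3's
# bedrock-agent-runtime client. IDs are placeholders; this is not
# Dhan's actual configuration.

def build_request(agent_id: str, alias_id: str, session_id: str, text: str) -> dict:
    return {
        "agentId": agent_id,
        "agentAliasId": alias_id,
        "sessionId": session_id,   # lets the agent keep conversation history
        "inputText": text,
    }

def ask_agent(client, request: dict) -> str:
    # invoke_agent streams the reply back as 'chunk' events.
    response = client.invoke_agent(**request)
    parts = [
        event["chunk"]["bytes"].decode("utf-8")
        for event in response["completion"]
        if "chunk" in event
    ]
    return "".join(parts)

if __name__ == "__main__":
    import boto3  # deferred so the helpers above work without AWS credentials
    runtime = boto3.client("bedrock-agent-runtime")
    req = build_request("AGENT_ID", "ALIAS_ID", "session-1",
                        "What is the status of my KYC?")
    print(ask_agent(runtime, req))
```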

Regarding another customer, Razorpay, Jagannath said the company used generative AI to reduce payment failures. “Payment failures are still quite common today, and they have a significant impact. Whether in e-commerce or other sectors, a failed payment means a potential loss of sales for the customer.” 

The company recently launched Ray Concierge, an AI onboarding system that simplifies the often complex process of setting up payment gateways. 

Moreover, he added that generative AI has several applications in the BFSI sector, including fraud detection, customer experience, document summarisation, and process automation. Jagannath further explained that the financial sector handles vast amounts of data and documentation, and AI helps speed up processes such as underwriting, insurance claims processing, and customer support.

How is AWS Making UPI Possible?

Jagannath said that many payment providers today operate on AWS, and even on the enterprise side, AWS collaborates with several customers. “Every UPI payment, whether it is ₹5 or ₹5,000, impacts core banking systems. These platforms were not originally designed to handle such high transaction volumes. So, we are working with these customers to improve resiliency.” Citing RBI data, he noted that over 47% of the world’s real-time payments happen in India.

Speaking of resilience, he said that AWS ensures resiliency and scalability through its availability zone (AZ) architecture. “Cloud computing provides automatic scaling, allowing AWS to handle failures seamlessly without customer intervention,” he said.

He revealed that each AWS region, such as Mumbai and Hyderabad, consists of multiple AZs. These AZs contain one or more physical data centres, which are isolated from each other by using different power grids, water sources, and other infrastructure. They are also interconnected with high-bandwidth, low-latency networks, ensuring minimal delay in communication.

The redundancy across AZs makes it extremely difficult for an entire AWS region to go offline unless a major disaster occurs. AWS automates failovers between AZs, meaning customers experience no visible downtime.
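The failover behaviour described can be sketched as routing around an unhealthy zone. The zone names mirror AWS's Mumbai region, but the health model below is purely illustrative:

```python
# Toy illustration of multi-AZ failover: requests go to the first healthy
# zone, so a single zone failing causes no visible downtime.

ZONES = ["ap-south-1a", "ap-south-1b", "ap-south-1c"]

def route(request: str, healthy: set) -> str:
    for zone in ZONES:
        if zone in healthy:
            return f"{request} served by {zone}"
    # Only if every AZ is down does the region go offline.
    raise RuntimeError("entire region offline")

print(route("GET /balance", {"ap-south-1a", "ap-south-1b"}))
# → GET /balance served by ap-south-1a
print(route("GET /balance", {"ap-south-1c"}))   # 1a and 1b down: failover
# → GET /balance served by ap-south-1c
```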

Moreover, he added that AWS is one of the most secure clouds and follows the security-by-design approach. “We have various concepts, like landing zone, where we help our customers, especially banks, develop these security guardrails and policies,” Jagannath said. 

A landing zone is a pre-configured environment where security policies and guardrails are automatically applied. He explained that developers can write code within these security boundaries, ensuring compliance with policies from the start.
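The effect of such guardrails can be illustrated with a toy policy check that rejects deployments outside approved boundaries. The policy set below is invented for illustration, not AWS's actual landing-zone configuration:

```python
# Toy guardrail evaluation in the spirit of a landing zone: every
# deployment is checked against pre-configured policies before it
# can proceed. The policies are hypothetical examples.

GUARDRAILS = {
    "allowed_regions": {"ap-south-1", "ap-south-2"},   # e.g. keep data in India
    "require_encryption": True,
}

def check_deployment(deployment: dict) -> list:
    """Return a list of policy violations (empty means compliant)."""
    violations = []
    if deployment.get("region") not in GUARDRAILS["allowed_regions"]:
        violations.append("region outside approved boundary")
    if GUARDRAILS["require_encryption"] and not deployment.get("encrypted", False):
        violations.append("storage must be encrypted at rest")
    return violations

print(check_deployment({"region": "ap-south-1", "encrypted": True}))   # → []
print(check_deployment({"region": "us-east-1"}))
# → ['region outside approved boundary', 'storage must be encrypted at rest']
```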

AWS Efforts to Bring AI to Everyone

Jagannath believes that more enterprises and customers will adopt generative AI services if they receive proper training. He said that AWS has trained about 5.9 million individuals in India. 

“We do a lot of outreach to our developers across whether they’re working for large banks or working for large system integrators, it doesn’t matter for us, because they are the ones who are the pillars of foundations on how to adopt technology.”

Moreover, in 2025, AWS announced plans to invest $8.3 billion into cloud infrastructure in the AWS Asia-Pacific (Mumbai) Region in Maharashtra to further expand cloud computing capacity in India. This investment is estimated to contribute $15.3 billion to India’s gross domestic product (GDP) and support more than 81,300 full-time jobs annually in the local data centre supply chain by 2030. 

“Our investments and operations in India are enabling customers of all segments to experiment and build technology applications and platforms, re-invent industries and their business models, and power their growth,” Jagannath concluded.

AWS Announces New Quantum Chip Ocelot, Reduces Error Correction Costs by 90%
https://analyticsindiamag.com/ai-news-updates/aws-announces-new-quantum-chip-ocelot-reduces-error-correction-costs-by-90/
Thu, 27 Feb 2025 12:30:57 +0000
The company uses a novel design for the chip’s architecture using ‘cat qubits’ as an ode to Schrödinger’s thought experiment.

Amazon Web Services (AWS) announced on Thursday a new quantum computing chip called Ocelot. The company claims that compared to current approaches, the chip can reduce the costs of implementing quantum error correction by up to 90%. 

Ocelot was developed by the AWS Centre for Quantum Computing, which is based at the California Institute of Technology.

“Ocelot represents a breakthrough in the pursuit of building fault-tolerant quantum computers capable of solving problems of commercial and scientific importance that are beyond the reach of today’s conventional computers,” the company said. 

The company uses a novel design for the chip’s architecture using ‘cat qubits’ as an ode to Schrödinger’s thought experiment. 

AWS said cat qubits intrinsically suppress certain errors, reducing the resources required for quantum error correction.
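In the cat-qubit literature, the logical states are superpositions of two coherent states of opposite phase, which is what yields the intrinsic error suppression; a standard way to write them is:

```latex
% Cat-qubit code states: superpositions of the coherent states
% |alpha> and |-alpha>; bit-flip errors are suppressed exponentially
% in the mean photon number |alpha|^2.
\ket{\mathcal{C}^{\pm}_{\alpha}} \propto \ket{\alpha} \pm \ket{-\alpha},
\qquad
\text{bit-flip rate} \sim e^{-2|\alpha|^{2}}
```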

This approach led the researchers at AWS to combine cat qubit technology and additional quantum error correction components into a microchip that can be manufactured for scale. 

Error correction is essential for quantum computing, as these systems are highly sensitive to noise or disruptions. 

AWS addresses this problem by building error correction into the architecture from the ground up, a different approach from others that incorporate error correction after implementing the architecture.

“We selected our qubit and architecture with quantum error correction as the top requirement. We believe that if we’re going to make practical quantum computers, quantum error correction needs to come first,” said Oskar Painter, director of quantum hardware at AWS. 

AWS has also published a technical research report in Nature outlining the new technology.

The company revealed that Ocelot is still a prototype and said it will continue to invest in further research.

Recently, Microsoft announced the Majorana 1 quantum chip, calling it the world’s first quantum chip with a new ‘Topological Core’ architecture. Microsoft says the chip, slightly larger than a desktop computer CPU, is designed to scale to one million qubits.

The chip uses a novel material called a ‘topoconductor’ or topological superconductor to control Majorana particles, leading to more reliable qubits. 

The Rise of Uni-Unicorns
https://analyticsindiamag.com/global-tech/the-rise-of-uni-unicorns/
Mon, 10 Feb 2025 11:25:49 +0000
As India’s digitisation and cloud push fuel unicorn growth, AWS is aiming to invest $13 billion in infrastructure by 2030.

India is on the brink of a new era in entrepreneurship—one in which billion-dollar startups are created and scaled by one person rather than a founding team. Jeff Barr, AWS’s chief evangelist, believes AI-powered tools like Amazon Q Developer will fuel the rise of “uni-unicorns”, turning solo founders into global tech disruptors.

India’s Developer Goldmine

On his visit to India, Barr was struck by the sheer scale of the country’s developer community. With companies like TCS and Infosys housing over 300,000 developers each, the numbers dwarf even major global tech hubs.

“Coming from Seattle, where the whole city has a population of 900,000, it’s incredible to see a single company in India employing nearly a third of that,” Barr remarked in an exclusive conversation with AIM.

According to Barr, what sets Indian developers apart is their hunger to learn. Barr is right. India has a plethora of self-taught coders—individuals who, within months, transition from non-tech backgrounds to mastering C, C++, and Java, thanks to the wealth of free resources and AI code assistants available today.

The AI Leverage: From Dorm Rooms to Unicorns

AWS has long envisioned a future where a single developer in their dorm room could build the next global success story. With AI tools handling code generation, debugging, and deployment, that vision is rapidly becoming a reality.

“The hardest part of coding is the blank screen. AI eliminates that. Now, developers don’t start from scratch—they start with an intelligent assistant guiding them,” said Barr.

Amazon Q Developer is already delivering significant productivity gains. At Tata Consultancy Services, it has cut test generation time by 30%. Startups like Constems-AI have accelerated AI-powered image recognition features by 25%. 

At Amazon itself, Q Developer has saved an estimated 4,500 years of manual work and $260 million annually in performance improvements.

Amazon Q Developer vs the World

While AI code assistants like Microsoft Copilot, Cursor, Replit, and Devin AI are making waves, Amazon Q Developer claims to take a more comprehensive approach by embedding AI across the entire software development lifecycle. 

Unlike tools that focus on code generation, Q Developer assists with everything from writing test cases and documentation to conducting security reviews and optimising legacy codebases. This holistic integration gives developers more than just an autocomplete feature—it acts as a full-fledged coding assistant designed to enhance efficiency at every step.

“Developers do much more than just writing new code. There’s debugging, maintenance, security, and compliance—things that take up a huge part of their time. Q Developer helps with all of that, not just generating snippets of code, but actually improving the entire workflow,” said Barr, highlighting Amazon Q Developer’s distinction. 

He believes that by automating tedious tasks and reducing the grunt work, Q Developer enables developers to focus on problem-solving, innovation, and scaling their applications faster than ever before.

Recently, GitHub Copilot introduced Agent Mode, which enhances its ability to iterate on code, recognise errors, and fix them automatically. The company is also building Project Padawan, an AI agent designed to manage complex coding tasks and automate workflows.

The Next Big Tech Boom Will Be in India

India’s rapid digitisation, combined with deep investments in cloud infrastructure, is setting the stage for the rise of uni-unicorns. AWS has already trained 5.9 million individuals in cloud skills and is committing $12.7 billion to expand cloud infrastructure in India by 2030.

“With AWS regions across India and AI tools making development faster than ever, the barriers to building billion-dollar businesses are falling,” Barr said.

However, he emphasised that while AI accelerates development, human creativity remains at the core. “Developers are still in control. AI can suggest, but you make the final call,” he added.

While AI accelerates software development, the role of engineering is evolving. Developers are now required to choose between mastering low-level programming and becoming product builders who leverage AI.

What’s Next?

The tech industry is shifting. AI-enabled coding is no longer a futuristic concept—it’s happening now. With Indian developers at the forefront, Barr believes the next wave of global startups won’t come from Silicon Valley but from a solo developer in India, armed with AI, building the future.

“It’s an amazing time to be a developer,” Barr said.

AWS Commits ₹60,000 Crore for Data Centres in Telangana
https://analyticsindiamag.com/ai-news-updates/aws-commits-%e2%82%b960000-crore-for-data-centres-in-telangana/
Thu, 23 Jan 2025 10:13:26 +0000
AWS has already invested $1 billion in three operational centres in Telangana and plans to invest an additional $4.4 billion by 2030.

Amazon Web Services (AWS) has committed to investing ₹60,000 crore to establish new data centres in Telangana. 

The announcement was made following discussions at the World Economic Forum (WEF) in Davos, which was attended by Telangana Chief Minister Revanth Reddy, Telangana IT and industries minister Sridhar Babu and AWS representatives, including Michael Punke, vice president of global public policy at AWS.

The investment aligns with AWS’s broader expansion plans in India, which include developing Hyderabad’s data centre ecosystem. AWS has already invested $1 billion in three operational centres in Telangana and plans to invest an additional $4.4 billion by 2030.

Emphasising the importance of the deal, Reddy said, “Global businesses like Amazon have placed their confidence in us, marking a significant step in our Telangana Rising vision.”

The Telangana government is facilitating the expansion by providing additional land for AWS’s new projects. “With this deal, Hyderabad is set to become the data centre hub of India and a global leader in cloud services, including AI,” Babu said.

AWS’s investments are expected to strengthen Telangana’s position as a critical player in India’s digital and cloud infrastructure growth.

AWS has also announced an $8.3 billion investment in cloud infrastructure in Maharashtra as part of its expansion into the AWS Asia Pacific (Mumbai) Region.

Meanwhile, Microsoft recently announced a $3 billion commitment to expand Azure’s infrastructure in the country at the Microsoft AI Tour in Bengaluru. The investment will scale Microsoft’s regional cloud infrastructure to bolster AI and computing capabilities. Moreover, Microsoft will train 10 million people in AI by 2030 as a part of its ADVANTA(I)GE INDIA initiative.

AWS to Invest $8.3 Billion in Cloud Infrastructure in Maharashtra by 2030
https://analyticsindiamag.com/ai-news-updates/aws-to-invest-8-3-billion-in-cloud-infrastructure-in-maharashtra-by-2030/
Thu, 23 Jan 2025 04:16:19 +0000
“This partnership will strengthen our technological infrastructure, create jobs, and drive economic growth,” CM Devendra Fadnavis said.

Amazon Web Services (AWS) has announced an $8.3 billion investment in cloud infrastructure in Maharashtra as part of its AWS Asia Pacific (Mumbai) Region expansion.

The announcement is part of AWS’s broader $12.7 billion commitment to India by 2030, aimed at meeting the growing demand for cloud services and artificial intelligence (AI) nationwide.

The Maharashtra government and AWS formalised the agreement on January 22, when they signed a memorandum of understanding (MoU) at the World Economic Forum in Davos.

“We are proud to be collaborating with AWS on this investment, which aligns with our vision to make Maharashtra a global capital for data centres,” Maharashtra CM Devendra Fadnavis said. “This partnership will strengthen our technological infrastructure, create jobs, and drive economic growth.”

“The investment in Maharashtra is estimated to add more than $15 billion to India’s GDP (gross domestic product) and support more than 81,000 full-time jobs in the local data centre supply chain annually by 2030,” Matt Garman, CEO at AWS, said.

David Zapolsky, senior vice president of global public policy and general counsel at Amazon, said, “This collaboration supports the state’s digital ambitions and democratises access to emerging technology.”

AWS estimates the investment will contribute $15.3 billion to India’s GDP by 2030, spanning sectors such as telecommunications, construction, electricity generation, facilities maintenance, and data centre operations.

AWS has already invested $3.7 billion in Maharashtra from 2016 to 2022. This latest funding aims to expand its cloud computing capacity in India, following its initial AWS Asia Pacific (Mumbai) Region launched in 2016 and the AWS Asia Pacific (Hyderabad) Region launched in 2022.

AWS’s infrastructure supports enterprises such as Axis Bank, HDFC Bank, and ICICI Lombard, along with startups like Healthify and public-sector organisations such as Coal India Limited and the Maharashtra State Electricity Distribution Company Limited.

The company is also fostering innovation through its AWS Partner Network (APN), which includes collaborators like TCS, PwC, Deloitte, and others to help Indian businesses build and scale digital solutions.

Since 2017, AWS has trained over 5.9 million individuals in cloud skills through initiatives like the AWS Skills to Jobs Tech Alliance and STEAM education programs in Maharashtra. Additionally, the company has supported skills and entrepreneurship training for women, job readiness programs for youth, and health initiatives in public schools.

Meanwhile, Microsoft recently announced a $3 billion commitment to expand Azure’s infrastructure in the country at the Microsoft AI Tour in Bengaluru. The investment will scale Microsoft’s regional cloud infrastructure to bolster AI and computing capabilities.

Moreover, Microsoft will train 10 million people in AI by 2030 as a part of its ADVANTA(I)GE INDIA initiative. 

How MongoDB Helped Zepto Reduce Latency by 40%
https://analyticsindiamag.com/ai-features/how-mongodb-helped-zepto-reduce-latency-by-40/
Thu, 16 Jan 2025 04:00:00 +0000
“Behind Zepto’s 10-minute delivery promise is a database built for speed.”

Zepto, the quick commerce giant, leverages MongoDB Atlas to overcome significant scalability challenges. “Behind Zepto’s 10-minute delivery promise is a database built for speed!” said Sachin Chawla, VP for India & South Asia at MongoDB, in a recent social media post. 

By shifting from a monolithic architecture with relational databases to MongoDB’s NoSQL platform, Zepto reduced latency for its critical APIs by 40%. This change also boosted its ability to handle six times more traffic, improving both operational efficiency and the overall customer experience.

The Challenge: Scaling Quick Commerce

Founded in 2021, Zepto revolutionised the Indian grocery delivery industry with its promise of 10-minute deliveries. However, as Zepto expanded—achieving $1.5 billion in annualised sales by mid-2024—its legacy infrastructure struggled to keep pace. 

Performance bottlenecks, high latency, and an inability to deliver real-time analytics hindered growth and operational efficiency.

A few months ago, during a presentation at MongoDB.local, Mayank Agarwal, senior architect at Zepto, explained: “We had a big monolith, and all the components were powered by PostgreSQL and a few Redis clusters. As our business grew, we faced many performance issues and restrictions on the velocity at which we wanted to operate.”

Zepto’s reliance on Redis clusters for caching also became a bottleneck as the business scaled. Kshitij Singh, technical lead at Zepto, noted, “We had a pretty huge Redis setup which we wanted to get rid of, as we were not able to scale it further. MongoDB’s in-memory caching solved this problem for us.”

Why MongoDB Atlas?

Zepto sought to break its monolithic architecture into microservices and migrate to a NoSQL database to address its issues. After evaluating multiple databases, MongoDB stood out due to its ability to handle nested document structures and array-based queries effectively.

“The queries were very performant, given the required indexes we had created, and that gave us confidence,” said Agarwal. “The biggest motivating factor was when we saw that MongoDB provides in-memory caching, which could address the huge Redis cluster that we couldn’t scale further.”
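The nested-document and array-based queries mentioned above map naturally to MongoDB filter documents. The schema below is invented for illustration, not Zepto's actual data model, and the filters are plain dicts in PyMongo syntax, so no database is needed to inspect them:

```python
# Illustrative MongoDB query shapes of the kind described: a nested-document
# filter, an array-based filter, and a supporting compound index, written as
# PyMongo-style dicts over a hypothetical 'orders' collection.

# Matches documents by fields inside an embedded sub-document (dot notation).
nested_filter = {
    "delivery.store_id": "mumbai-042",
    "delivery.eta_minutes": {"$lte": 10},
}

# Matches documents whose 'items' array contains one element satisfying
# both conditions at once.
array_filter = {"items": {"$elemMatch": {"sku": "MILK-1L", "qty": {"$gte": 2}}}}

# Compound index supporting the nested filter (field order matters).
index_spec = [("delivery.store_id", 1), ("delivery.eta_minutes", 1)]

# Against a live collection this would be:
#   db.orders.create_index(index_spec)
#   db.orders.find(nested_filter)
print(sorted(nested_filter))
# → ['delivery.eta_minutes', 'delivery.store_id']
```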

MongoDB’s Atlas platform also offered features such as real-time data archival and analytical nodes, enabling Zepto to separate customer-facing workloads from internal queries. Singh highlighted: “This ensured that customer performance wasn’t compromised.”

Results: Faster, Scalable, and Reliable

Zepto’s migration to MongoDB Atlas resulted in improved efficiency at an overall operational level. The company witnessed a huge reduction in latency, thereby significantly improving the response times of critical APIs and enhancing the customer experience. 

Additionally, the infrastructure scaled to handle 6x more traffic, ensuring no performance degradation. 

The migration to MongoDB also led to a 14% improvement in page load times, contributing to higher conversion rates and increased sales. MongoDB’s support for analytical nodes enabled Zepto to maintain consistent performance, ensuring effective real-time analytics across the platform.

The shift also streamlined Zepto’s ETL pipelines and archival processes, reducing management overhead and enabling faster deployment of new features.

“Post the migration, we observed a latency reduction of up to 40% for key APIs and handled throughput spikes, such as during campaigns, by enabling data compression and other optimisations. This has made our infrastructure more robust and scalable,” Singh said.

Scaling New Heights with AWS

Zepto’s success story is also powered by its strong partnership with AWS.

“We started Zepto about three years ago with just one dark store in Mumbai. Today, we’ve grown to over 200 dark stores and annualised sales north of INR 10,000 crore,” said Kaivalya Vohra, co-founder of Zepto. “From day one, we were clear that AWS was the way to go.”

AWS’ managed services and support allowed Zepto to launch new features rapidly while optimising costs. “AWS helped us achieve a quick MVP with credits from the AWS Activate program. As we scaled, the collaboration with AWS architects enabled us to come up with solutions no one had tried before,” noted Singh.

Zepto leverages AWS across the entire customer journey, from enhancing product discovery to streamlining last-mile logistics. Generative AI solutions on AWS have also enabled faster resolution times and improved customer satisfaction.

“AWS has been instrumental in helping us scale quickly while optimising costs. Its enterprise support and sponsorship for POCs have been invaluable,” Singh added.

The Pricing Dilemma 

Zepto is currently under fire for differential pricing algorithms on Android and iPhones. Several customers have expressed their frustration on social media platforms, speculating about the possible reasons behind the stark difference in prices.

Some believe the pricing is influenced by demographic and user device data, which could potentially favour Android users. The company has yet to officially comment on this issue.

AWS Launches New Cloud Region in Thailand to Boost Southeast Asia’s Digital Economy
https://analyticsindiamag.com/ai-news-updates/aws-launches-new-cloud-region-in-thailand-to-boost-southeast-asias-digital-economy/
Fri, 10 Jan 2025 07:48:21 +0000
Thailand’s location in APAC provides a gateway to neighbouring markets, giving Indian businesses a chance to serve a broader audience.

Amazon Web Services (AWS) has launched its data centre infrastructure region in Asia Pacific (Thailand), allowing businesses in Thailand to store data locally and run applications faster. The region includes three availability zones to ensure reliability and support business continuity.

The company said it will invest over $5 billion in Thailand over the next 15 years. According to AWS’s blog, this investment is expected to contribute $10 billion to Thailand’s economy and create more than 11,000 jobs each year in construction, engineering, and telecommunications.

Paetongtarn Shinawatra, Prime Minister of Thailand, welcomed the development, stating, “This investment by AWS solidifies Thailand’s position as a regional hub for digital innovation and will help us build a more inclusive digital society.” 

Several Thai companies, including Charoen Pokphand Group and KASIKORN Business-Technology Group, use AWS to improve operations and scale their businesses. The company claims that government projects like the Big Data Institute’s Health Link, which aims to make healthcare data more accessible, will also benefit from the new infrastructure.

AWS’s expansion in Southeast Asia comes at a time when the demand for cloud services is rising. Organisations like the Stock Exchange of Thailand and startups such as Pomelo Fashion are using AWS to improve efficiency and support growth.

“We continue to see rapid cloud adoption across Asia Pacific as more customers unlock the full potential of the world’s most extensive, reliable, and secure cloud,” said Prasad Kalyanaraman, vice president of Infrastructure Services at AWS.

What’s in it for India?

This development creates new opportunities for Indian startups looking to expand into Southeast Asia. Thailand’s upgraded cloud infrastructure allows Indian startups to access advanced AWS tools and services, such as artificial intelligence and machine learning, while complying with data regulations.

Thailand’s location in Southeast Asia provides a gateway to neighbouring markets, giving Indian businesses a chance to serve a broader audience. For startups in sectors like fintech, e-commerce, and SaaS, the reduced costs and better performance offered by AWS’s local region can make it easier to enter the region and compete effectively.

]]>
HERE, AWS Enter $1 Billion Partnership for New AI Mapping Solutions https://analyticsindiamag.com/ai-news-updates/here-aws-enter-1-billion-partnership-for-new-ai-mapping-solutions/ Tue, 07 Jan 2025 07:28:07 +0000 https://analyticsindiamag.com/?p=10160859 HERE will integrate AWS’s high-performance cloud capabilities with its proprietary AI and ML models to offer automakers cutting-edge location-aware software. ]]>

Netherlands-based location data and technology platform HERE Technologies and Amazon Web Services (AWS) have announced a decade-long collaboration to transform the development of software-defined vehicles (SDVs) and advance automotive innovation. 

The partnership, valued at $1 billion, will leverage AWS’s cloud infrastructure to power HERE’s AI-driven mapping solutions, enabling automakers and mobility companies to enhance electric, automated, and connected vehicle technologies, as announced by the company on Monday. 

Matt Garman, CEO of AWS, and Mike Nefkens, CEO of HERE, also announced the partnership on LinkedIn during the CES 2025 event.

SceneXtract Tool for Better ADAS

Under the agreement, HERE will integrate AWS’s high-performance cloud capabilities with its proprietary AI and ML models to offer automakers cutting-edge location-aware software. 

These advancements are expected to accelerate the development of advanced driver assistance systems (ADAS) through HERE’s new SceneXtract tool, automated driving features, and dynamic in-vehicle digital experiences.

This tool combines HERE’s HD Live Map data with AWS’s generative AI services. It enables automakers to quickly create simulation-ready environments, drastically reducing the time and costs associated with testing.

HERE’s New AI Assistant

Along with the partnership, HERE also unveiled its AI Assistant, a solution designed to revolutionise navigation and logistics for software-defined vehicles and transportation companies. 

By leveraging multiple GenAI LLMs, the AI assistant delivers location-aware guidance and natural language-driven insights. The assistant enhances personalised travel planning, enabling users to receive tailored route suggestions based on driving habits, real-time conditions, and specific preferences. 

Key features include complex travel planning, intelligent electric vehicle (EV) routing, and improved vehicle safety through precise map data integration. For instance, EV users can easily locate charging stations with specific amenities, while advanced safety systems provide real-time alerts on speed limits and hazardous conditions.

The solution, set for integration into HERE Navigation and the HERE SDK, will also be available in the HERE WeGo Pro mobile app for logistics companies by 2025. The app aims to optimise fleet management with natural language controls and ensure safety and efficiency on the road.

Hyper-Accurate Vehicle Navigation

Detailed and live location information is becoming essential for ADAS and automated driving, which ensures vehicles know their exact position in real time. 

This precision allows cars not only to react to their surroundings but also to anticipate and adapt, smoothing onboard decision-making and optimising routes, even for multi-stop EV charging.

“Location technology is at the heart of the automotive industry’s software-defined vehicle revolution. This partnership enables our customers to leverage our state-of-the-art location technology for faster software development and real-time data analytics throughout the entire SDV lifecycle,” Nefkens said. 

HERE’s live mapping solutions aim to improve navigation, EV efficiency, and route optimisation. The new tools promise to reduce software development time, enabling faster deployment of innovations while addressing complex data processing requirements.

They also empower developers with advanced location-based tools and geospatial data capabilities. These include dynamic and static maps, route planning, place search, geocoding, and device tracking, enabling businesses to enhance navigation, asset monitoring, and operational efficiency.

Transforming Logistics with AI-Driven Solutions 

The companies also formed a strategic collaboration agreement (SCA) beyond the automotive industry, introducing new transportation and logistics solutions.

Built on HERE’s location intelligence and AWS’s cloud infrastructure, these tools optimise supply chains, enhance real-time asset tracking and shipment visibility, and support sustainable delivery goals for enterprises and independent software vendors.

The collaboration solidifies HERE’s role as a key player in location intelligence and provides tools for developers to create essential geospatial services and reimagine mobility solutions.

In 2020, HERE Technologies also partnered with Deduce Technologies to enhance real-time traffic solutions in India by integrating Deduce’s extensive GPS probe data from commercial vehicle fleets. 

Last month, General Motors (GM) abandoned its robotaxi business to focus on personal autonomous vehicles. This restructuring aimed to cut annual spending by over $1 billion, with completion targeted for the first half of 2025.

Over the last year, AWS has partnered with HCLTech to help enterprises explore GenAI use cases, PoCs, and solutions, and with Anthropic, becoming its primary cloud partner through a $4 billion investment.

Amid these partnerships and developments in the AI and cloud services sectors, Pinecone, a knowledge platform for building accurate and scalable AI applications, announced the integration of industry-first inference capabilities into its vector database. The company has raised a total of $138 million in funding.

]]>
Discover the Next Big Thing in AI at AWS AI Conclave 2025 https://analyticsindiamag.com/ai-highlights/discover-the-next-big-thing-in-ai-at-aws-ai-conclave-2025/ Mon, 30 Dec 2024 08:52:53 +0000 https://analyticsindiamag.com/?p=10152354 Whether you’re a business leader, developer, AI/ML practitioner, or technology enthusiast, this event provides a platform to learn, network, and collaborate.]]>

Amazon Web Services is all set to host the 8th edition of the AWS AI Conclave in Bengaluru on January 24, 2025, at the Sheraton Grand in Whitefield. The conclave promises to bring together industry leaders, innovators, and technology enthusiasts to explore the latest advancements in generative AI, data foundations, machine learning, and cloud technology.

The event is ideal for AI/ML and data CXOs and practitioners, business leaders, founders, technology entrepreneurs, and developers. Sandeep Dutta, president of AWS India and South Asia, will deliver the welcome note at the event.

Why Attend the Amazon AI Conclave 2025?

Attendees can look forward to:

  • Live Keynotes: Hear from AWS leaders about the latest products, capabilities, and features that simplify the adoption of generative AI at scale for companies of all sizes.
  • Technical Sessions: Engage in over 30 technical sessions covering a wide array of topics, from building and scaling generative AI to implementing robust data strategies.
  • Industry and Builder’s Hub: Explore the real-world applications of AI and interact with experts to gain insights into building with AWS and cloud computing.
  • AI Awards: Recognise and celebrate innovations in the AI space, acknowledging organisations and individuals making significant contributions to the field.

REGISTER NOW

AWS recently concluded its annual flagship event re:Invent 2024 in Las Vegas, where it introduced Amazon Nova, a family of foundation models accessible through Amazon Bedrock. These models are built for tasks such as content generation, video understanding, and developing agentic applications. They are available in six different sizes to cater to diverse requirements.

The Amazon Nova lineup includes:

  • Amazon Nova Micro: A lightning-fast text-to-text model optimised for speed and cost-efficiency.
  • Amazon Nova Lite and Pro: Multimodal models capable of processing text, images, and videos.
  • Amazon Nova Premier: Set to launch in early 2025, it focuses on complex reasoning tasks.
  • Amazon Nova Canvas: A powerful image generation model with advanced editing features.
  • Amazon Nova Reel: A cutting-edge video generation model producing studio-quality content.

AWS announced the general availability of Amazon Bedrock, a fully managed service that provides access to foundation models from leading AI companies via a single API. The cloud giant has also announced new chips, including Trainium2, Graviton 4, and Inferentia, which are challenging NVIDIA. 

AWS claims that Trainium2 offers 30-40% better price performance compared to the previous generation of GPU-based Elastic Compute Cloud (EC2) instances. Customers and foundation model partners such as Anthropic, Databricks, Adobe, Qualcomm, Poolside, and even Apple are already on board.

The company also unveiled Trn2 UltraServers and the next-generation Trainium3 AI training chip.

In a surprising revelation at AWS re:Invent, Apple’s senior director of machine learning and AI, Benoit Dupin, announced that the company leverages AWS’s custom AI chips for various cloud services. This collaboration has resulted in a 40% efficiency gain for Apple’s services, including Siri, Apple Maps, and Apple Music.

Furthermore, Apple is evaluating Amazon’s latest Trainium2 chip to pre-train its Apple Intelligence models, which could potentially improve efficiency by up to 50%. 

The conclave will feature sessions exploring key updates from AWS re:Invent 2024, such as Amazon Nova’s multimodal capabilities and the performance improvements of Trainium2 chips.

Event Details and Registration

Date: January 24, 2025

Time: 9 am to 5 pm (IST) 

Venue: Sheraton Grand Hotel and Convention Center in Whitefield, Bengaluru. 

This event is ideal for:

  • AI/ML & Data CXOs & Business leaders
  • Founders and Co-founders
  • AI/ML practitioners & Technology Entrepreneurs
  • Developers

Don’t miss this opportunity to be at the forefront of AI innovation and cloud computing. Register now to secure your spot and join a community that drives innovation and business impact through AI.

REGISTER NOW

]]>
AWS Thinks it Can Solve NVIDIA’s Customer Problems https://analyticsindiamag.com/global-tech/aws-thinks-it-can-solve-nvidias-customer-problems/ Tue, 24 Dec 2024 07:28:19 +0000 https://analyticsindiamag.com/?p=10144921 AWS is working with Anthropic to build Project Rainier – a large AI compute cluster powered by thousands of Trainium2 chips.]]>

Amazon Web Services (AWS) is preparing to take on NVIDIA as a strong contender, with its 2015 acquisition of Annapurna Labs – an Israeli startup whose name was inspired by the Annapurna mountain range in the Himalayas – proving to be an advantage. At the AWS re:Invent in Las Vegas, the cloud giant announced new chips, including Trainium2, Graviton 4 and Inferentia.

AWS claims that Trainium2 offers 30-40% better price performance than the previous generation of graphics processing unit (GPU)-based Elastic Compute Cloud (EC2) instances. Customers like Anthropic, Databricks, Adobe, Qualcomm, Poolside, and even Apple are already on board. 

“Today, there’s really only one choice on the GPU side, and it’s just NVIDIA,” said Matt Garman, CEO at AWS. “We think that customers would appreciate having multiple choices.”

It is worth noting that AWS recently invested $4 billion in Anthropic, making it the primary cloud provider and training partner. The company also introduced Trn2 UltraServers and the next-generation Trainium3 AI training chip. 

AWS is working with Anthropic to build Project Rainier – a large AI compute cluster powered by thousands of Trainium2 chips. This will help Anthropic develop its models, including optimising its flagship product Claude, to run on Trainium2 hardware.

“This cluster is going to be five times the number of exaflops as the current cluster that Anthropic used to train their leading set of Claude models that are out there in the world,” Garman added. 

On the other hand, OpenAI plans to partner with Taiwan Semiconductor Manufacturing Company (TSMC) and Broadcom to launch its first in-house AI chip by 2026. Meanwhile, OpenAI is also banking on NVIDIA’s Blackwell GPU architecture to scale its o1 model and test time compute. 

Notably, Anthropic CEO Dario Amodei, in a recent podcast, said that the cost of training AI models today can reach up to $1 billion. While models like GPT-4 cost approximately $100 million, he predicts that within the next three years, training costs could escalate to $10 or even $100 billion.

Advantage Trainium? 

According to Garman, Trainium2 delivers 30-40% better price performance than current GPU-powered instances. The new TRN2 instances come equipped with 16 custom-built chips interconnected via NeuronLink, a high-speed and low-latency interconnect. This configuration provides up to 20.8 petaflops of compute from a single node. 

The company also introduced Trn2 UltraServers, which combine four Trn2 servers into a single system and offer 83.2 petaflops of compute power for better scalability. These servers feature 64 interconnected Trainium2 chips. For comparison, NVIDIA’s Blackwell B200 is expected to provide up to 720 petaflops of FP8 performance with a rack of 72 GPUs.
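The quoted figures can be cross-checked with simple arithmetic, assuming compute scales linearly with chip count:

```python
# Back-of-the-envelope check on the quoted Trn2 figures,
# assuming compute scales linearly with the number of chips.
chips_per_trn2 = 16
trn2_petaflops = 20.8

per_chip = trn2_petaflops / chips_per_trn2            # ~1.3 petaflops per Trainium2 chip

ultraserver_chips = 64                                # four Trn2 servers combined
ultraserver_petaflops = per_chip * ultraserver_chips

print(per_chip, ultraserver_petaflops)                # 1.3 83.2
```

The 83.2-petaflop UltraServer figure is exactly four times the 20.8-petaflop single-instance figure, consistent with four Trn2 servers being combined into one system.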

Few expected Apple to use AWS Trainium2 to train its models. Benoit Dupin, Apple’s senior director of machine learning and AI, revealed how deeply the company relies on AWS for its AI and ML capabilities. Dupin accredited the decade-long partnership with AWS for enabling Apple’s innovations like Siri, iCloud Music, and Apple TV. “AWS has consistently supported our dynamic needs at scale and globally.”

Apple has leveraged AWS’s solutions, including Graviton and Inferentia chips. It achieved milestones like a 40% efficiency boost by migrating to Graviton instances. Dupin also teased early success with AWS Trainium2 chips, which could deliver a 50% leap in pre-training efficiency.

“We knew that the first iteration of Trainium wasn’t perfect for every workload, but we saw enough traction to give us confidence we were on the right path,” Garman revealed.

Trainium chips leverage the Neuron SDK, which efficiently optimises AI workloads. It supports deep learning inference and training and seamlessly integrates with TensorFlow and PyTorch while avoiding closed-source dependencies. However, it still faces challenges from NVIDIA’s CUDA.

Switching from NVIDIA to Trainium requires hundreds of hours of testing and rewriting code – a barrier few companies want to cross. Acknowledging this challenge internally, AWS called CUDA the single biggest reason customers stick with NVIDIA.

Meanwhile, Amazon’s cloud rivals, Microsoft and Google, are working on their own AI chips to reduce their reliance on NVIDIA. Google recently announced the general availability of Trillium, its sixth generation of Tensor Processing Unit (TPU). “Using a combination of our TPUs and GPUs, LG AI Research reduced inference processing time for its multimodal model by more than 50% and operating costs by 72%,” said Google chief Sundar Pichai during the recent earnings call.

In a similar way, numerous companies today are competing for NVIDIA’s chip market share, including AI chip startups such as Groq, Cerebras Systems, and SambaNova Systems. 

The AI chip market is projected to hit $100 billion in the upcoming years, and AWS is pouring billions of dollars into Trainium to stake its claim. However, beating NVIDIA’s dominance won’t be an easy feat.

]]>
What the AI is Danish Sait Doing at AWS re:Invent 2024? https://analyticsindiamag.com/ai-features/what-the-ai-is-danish-sait-doing-at-aws-reinvent-2024/ Wed, 04 Dec 2024 07:32:16 +0000 https://analyticsindiamag.com/?p=10142437 "I’m clearly sticking out like a sore thumb here because it’s this unbelievable carnival of technology and people talking about AI.”]]>

At AWS re:Invent 2024, taking place in Las Vegas, a familiar face stood out amid the tech enthusiasts and AI experts—comedian and content creator Danish Sait. Known for his observational humour, Sait’s presence at the conference raised a few eyebrows. Why, of all places, was he at an event that primarily draws IT professionals and innovators?

When AIM journalist Anshika Mathews asked about his unusual appearance at the summit, Sait admitted, “I’m clearly sticking out like a sore thumb here because it’s this unbelievable carnival of technology and people talking about AI.”

 Despite the stark contrast in his field, Sait embraced the experience with enthusiasm. He added, “There are so many Indian people who are obviously the bright minds, and then I think they just threw me in there to just stick out.”

Sait, hailing from a city where IT is booming, explained that his curiosity brought him to the summit. “I just wanted to have fun because I belong to a city where IT is so big, and I was like, okay, cool, let me see what they really do at a conference like this.” His decision to attend came after someone mentioned the event to him, and he thought, “Great, it would be fun to just go check it out.”

As a comedian and content creator, Sait’s approach to AI is an interesting one. When asked how he uses AI in his work, his response was blunt: “I don’t. Hopefully, by the end of this conference, I will be using it.” 

However, Sait’s perspective on AI in the creative process is worth noting. “Artists are a little different, especially with what I do because it’s so observational,” he explained. “Maybe AI will get there someday, but while it isn’t around over there, I’m having a field day making content.”

For Sait, there’s a strong belief that human interpretation still has the upper hand. “Certain things are still best left to a person’s interpretation versus believing that a machine could do it,” he shared, reinforcing the idea that humour and creativity are deeply rooted in the personal touch.

]]>
Apple Intelligence is Nothing without AWS https://analyticsindiamag.com/ai-news-updates/apple-intelligence-is-nothing-without-aws/ Wed, 04 Dec 2024 04:12:00 +0000 https://analyticsindiamag.com/?p=10142415 Apple is also exploring AWS’s Trainium2 chips, with early evaluations suggesting a 50% improvement in pre-training efficiency.]]>

Apple surprised everyone with its presence at AWS re:Invent 2024. During his keynote, AWS chief Matt Garman invited Benoit Dupin, Apple’s senior director of machine learning and AI, on stage to speak about how the company works with Amazon Web Services (AWS) and uses its servers to power its AI and machine learning features.

Dupin said that the partnership with AWS, which spans more than a decade, has been crucial in scaling Apple’s machine learning (ML) and artificial intelligence (AI) capabilities.

Dupin, who oversees machine learning, AI, and search infrastructure at Apple, detailed how the company’s AI-driven features, including Siri, iCloud Music, and Apple TV, rely heavily on AWS’s infrastructure. “AWS has consistently supported our dynamic needs at scale and globally,” Dupin said.

Apple has increasingly leveraged AWS’s solutions, including its Graviton and Inferentia chips, to boost efficiency and performance. Dupin revealed that Apple achieved a 40% efficiency gain by migrating from x86 to Graviton instances. Additionally, transitioning to Inferentia 2 for specific search-related tasks enabled the company to execute features twice as efficiently.

This year, Apple launched Apple Intelligence, which integrates AI-driven features across iPhone, iPad, and Mac. “Apple Intelligence is powered by our own large language models, diffusion models, and adapts on both devices and servers,” Dupin said. Key features include system-wide writing tools, notification summaries, and improvements to Siri, all developed with a focus on user privacy.

To support this innovation, Apple required scalable infrastructure for model training and deployment. Dupin said, “AWS services have been instrumental across virtually all phases of our AI and ML lifecycle,” including fine-tuning models and building adapters for deployment. Apple is also exploring AWS’s Trainium2 chips, with early evaluations suggesting a 50% improvement in pre-training efficiency.

“AWS expertise, guidance, and services have been critical in supporting our scale and growth,” Dupin said.

Previously, Apple revealed that it uses Google’s Tensor Processing Units (TPUs) instead of the industry-standard NVIDIA GPUs for training its AI models. This information was disclosed in a technical paper published by Apple on Monday, outlining the company’s approach to developing its AI capabilities.

At AWS re:Invent 2024, Amazon Web Services (AWS) has announced the general availability of AWS Trainium2-powered Amazon Elastic Compute Cloud (EC2) instances. The new instances offer 30-40% better price performance than the previous generation of GPU-based EC2 instances.

]]>
AWS had a Hard Time Fitting in All of Bedrock’s Innovations at re:Invent 2024  https://analyticsindiamag.com/ai-news-updates/aws-had-a-hard-time-fitting-in-all-of-bedrocks-innovations-at-reinvent-2024/ Tue, 03 Dec 2024 19:35:20 +0000 https://analyticsindiamag.com/?p=10142407 “Bedrock gives you everything you need to integrate generative AI into production applications, not just proof of concepts,” says AWS CEO Matt Garman. ]]>

At AWS re:Invent in Las Vegas, Amazon Web Services (AWS) has announced exciting updates to Amazon Bedrock, its platform for creating and running AI applications. 

“One of the hardest parts was figuring out how much we could fit in,” said AWS chief Matt Garman, reflecting on the sheer scale of advancements in Bedrock. “Fortunately, Swami will dive deeper into a ton more during his keynote tomorrow.” 

Garman said that Bedrock is by far the easiest way to build and scale generative AI applications. 

One big addition to Bedrock is Automated Reasoning Checks, a tool designed to stop AI from making factual mistakes, aka hallucinations. This is especially useful for industries like healthcare and finance, where accuracy is critical. “Automated reasoning checks prevent factual errors due to model hallucinations,” said Garman. 

AWS further claimed that it helps ensure AI provides correct and trustworthy answers without needing advanced AI expertise. 

For example, PwC is using Automated Reasoning checks to build accurate, trustworthy AI assistants and agents to drive its clients’ businesses to the leading edge.

In addition to this, AWS announced the launch of Model Distillation. This lets users shrink large AI models into smaller ones without losing much accuracy. Smaller models are faster and cheaper to run. “Model Distillation in Bedrock delivers models that are 500% faster and 75% cheaper,” shared Garman. 

For instance, Robin AI is already using this to save money while providing quick, accurate answers for legal questions.

With Amazon Bedrock Model Distillation, customers can choose the optimal model for their use case and a smaller model from the same family, balancing application latency with cost efficiency. The company claimed that it works best with models from Anthropic and Meta, alongside its latest in-house Nova-series of models. 
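Bedrock exposes distillation as a managed service, but the underlying technique can be illustrated with a generic teacher-student sketch. The sketch below is a toy illustration of knowledge distillation in general, not Bedrock’s API: a small “student” model is trained to reproduce the temperature-softened output distribution of a “teacher” model.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields softer distributions.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Toy "teacher": a pre-trained linear model over 5 features, 3 classes.
X = rng.normal(size=(32, 5))                 # toy inputs
W_teacher = rng.normal(size=(5, 3))          # teacher parameters

T = 2.0                                      # distillation temperature
soft_targets = softmax(X @ W_teacher, T)     # the teacher's soft labels

# "Student": trained via gradient descent on cross-entropy against the
# teacher's soft labels instead of hard ground-truth labels.
W_student = np.zeros((5, 3))
for _ in range(3000):
    probs = softmax(X @ W_student, T)
    grad = X.T @ (probs - soft_targets) / len(X)
    W_student -= 0.5 * grad

# After training, the student's output distribution closely matches the teacher's.
gap = np.abs(softmax(X @ W_student, T) - soft_targets).max()
```

In practice the student is a smaller architecture than the teacher, which is what yields the speed and cost gains the article cites; the managed service handles the training loop on the customer’s behalf.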

“With a broad selection of models, leading capabilities that make it easier for developers to incorporate generative AI into their applications, and a commitment to security and privacy, Amazon Bedrock has become essential for customers who want to make generative AI a core part of their applications and businesses,” said Dr. Swami Sivasubramanian, vice president of AI and Data at AWS. 

AWS also showcased Bedrock’s ability to manage and coordinate multiple AI agents for large-scale, complex workflows. “Bedrock agents can now support complex workflows with agent collaboration, enabling seamless coordination for sophisticated tasks,” shared Garman. 

Moody’s uses Amazon Bedrock’s multi-agent system to improve risk analysis, with each agent handling specific tasks. This makes their assessments faster and more accurate, strengthening their position as a financial leader.

What’s Next? 

Garman said while generative AI is still in its early stages, Bedrock is positioning itself as a leader by offering innovative tools and models that address real-world challenges. 

“This is just a sampling of the new capabilities that we’re announcing this week. Bedrock gives you the best models, the right tools, and capabilities you cannot get anywhere else,” added Garman in his keynote address on the future vision of Bedrock. 

]]>
AWS Unveils Trainium2, Slashes AI Cost by 40% https://analyticsindiamag.com/ai-news-updates/aws-unveils-trainium2-slashes-ai-cost-by-40/ Tue, 03 Dec 2024 19:15:58 +0000 https://analyticsindiamag.com/?p=10142408 The cloud giant also introduced Trn2 UltraServers and unveiled its next-generation Trainium3 AI chip. ]]>

At AWS re:Invent 2024, Amazon Web Services (AWS) has announced the general availability of AWS Trainium2-powered Amazon Elastic Compute Cloud (EC2) instances. The new instances offer 30-40% better price performance than the previous generation of GPU-based EC2 instances. “Today, I’m excited to announce the GA of Trainium2-powered Amazon EC2 Trn2 instances,” said AWS chief Matt Garman.

In addition to this, the company also introduced Trn2 UltraServers and unveiled its next-generation Trainium3 AI chip. 

The Trn2 instances are built with 16 Trainium2 chips, delivering up to 20.8 petaflops of compute performance. They are intended for training and deploying large language models (LLMs) with billions of parameters. 

Trn2 UltraServers combine four Trn2 servers into a single system, offering 83.2 petaflops of compute for higher scalability. These new UltraServers feature 64 interconnected Trainium2 chips.

“The launch of Trainium2 instances and Trn2 UltraServers provides customers with the computational power needed to tackle the most complex AI models, whether for training or inference,” said David Brown, AWS vice president of compute and networking.

AWS is working with Anthropic to create Project Rainier, a large-scale AI compute cluster powered by hundreds of thousands of Trainium2 chips. This infrastructure will support Anthropic’s model development, including the optimisation of its flagship product, Claude, to run on Trainium2 hardware.

Databricks and Hugging Face have partnered with AWS to leverage Trainium2’s capabilities for improved performance and cost efficiency in their AI offerings. Databricks plans to utilise the hardware to enhance the Mosaic AI platform, while Hugging Face integrates Trainium2 into its AI development and deployment tools.

Other customers of Trainium2 include Adobe, Poolside, and Qualcomm. “Adobe is seeing very promising early testing after running Trainium2 against their Firefly inference model, and they expect to save significant amounts of money,” said Garman.

“Poolside expects to save 40% compared to alternative options,” he added. “Qualcomm is using Trainium2 to deliver AI systems that can train in the cloud and then deploy at the edge.”

AWS also previewed its Trainium3 chip, built using a 3-nanometer process node. Trainium3-powered UltraServers are expected in late 2025 and aim to deliver four times the performance of Trn2 UltraServers.

To optimise the use of Trainium hardware, AWS also introduced the Neuron SDK, a suite of software tools that enables developers to optimise their models for peak performance on Trainium chips. The SDK supports frameworks such as JAX and PyTorch, allowing customers to integrate the software into their existing workflows with minimal code changes. 

The Neuron SDK also supports over 100,000 models hosted on the Hugging Face model hub, further enhancing its accessibility for AI developers.

Trn2 instances are currently available in the US East (Ohio) region, with expansion to additional regions planned. UltraServers are in preview mode.

]]>
AWS Unveils Aurora DSQL Breaking Free from the Tyranny of Trade-offs https://analyticsindiamag.com/ai-news-updates/aws-unveils-aurora-dsql-breaking-free-from-the-tyranny-of-trade-offs/ Tue, 03 Dec 2024 18:30:00 +0000 https://analyticsindiamag.com/?p=10142403 Razorpay looks to use Aurora DSQL for building highly resilient financial applications that require strong consistency and rapid scaling.]]>

AWS has launched Amazon Aurora DSQL at the AWS re:Invent 2024 in Las Vegas, and AIM is covering the development from the ground up. 

AWS claims Amazon Aurora DSQL outpaces competitors, including Google Spanner, CockroachDB, and YugabyteDB, delivering 4x faster reads and writes, 99.999% multi-region availability, and zero infrastructure management. 
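For context on the availability figure, 99.999% (“five nines”) permits only a few minutes of downtime per year:

```python
# What a quoted 99.999% ("five nines") availability allows per year.
availability = 0.99999
minutes_per_year = 365.25 * 24 * 60          # 525,960 minutes

allowed_downtime = (1 - availability) * minutes_per_year
print(round(allowed_downtime, 2))            # ~5.26 minutes per year
```

By comparison, a more common 99.9% SLA would allow roughly 8.8 hours of downtime per year.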

“Aurora DSQL’s active-active architecture and automated failure recovery ensure that a customer’s application is always available by enabling an application to read and write to any Aurora DSQL endpoint,” read the official press release from Amazon. 

Amazon also says that transactions from one Region are now automatically reflected in others with ‘strong consistency’. It also eliminates the need to provision, patch, or manage database instances, and any updates or security patches will occur with no downtime. Aurora DSQL is also compatible with PostgreSQL. 

Amazon also mentions that it had to reinvent the way a relational database processed transactions. In order to overcome the challenge of achieving strong multi-Region consistency, Aurora DSQL ‘decouples transaction processing from storage’, mitigating limitations of current approaches, ‘which were constrained by information being passed back and forth multiple times at the speed of light’.

Amazon has also integrated a tool called Amazon Time Sync Service, which adds ‘hardware reference clocks’ on Amazon EC2 cloud instances, to synchronise them with satellite-powered atomic clocks to provide ‘microseconds level accurate time’. 

Matt Garman, CEO at AWS, revealed that when he asked customers what a perfect database needs, the answers contained a lot of ‘ands’, while the current competition focuses on just the ‘ors’: either multi-region capability, or high consistency, or low latency. With Aurora DSQL, he says, Amazon is focusing on whatever it takes to build the perfect serverless, distributed SQL database. 

Interestingly, Amazon also mentioned that Razorpay, one of India’s leading payment gateway services, is planning to use Aurora DSQL. “Aurora DSQL will help Razorpay achieve multi-Region strong consistency, which is critical for financial use cases that require high degrees of precision, for their applications while operating more efficiently at a global scale,” added Amazon in the announcement. 

Other companies planning to implement Aurora DSQL include Electronic Arts, Klarna, QRT, and Autodesk. JP Morgan Chase is also using Aurora DSQL to enable low-latency, strongly consistent data sharing across global operations, driving real-time analytics and cloud-first modernisation.

Apart from the new Aurora capability, Amazon also added updates to DynamoDB. It will now support strong multi-region consistency, ensuring users with multi-region applications always access the latest data without having to change any code. 

Ganapathy ‘G2’ Krishnamoorthy, VP of Database Services at AWS, said, “Aurora removes the need for customers to make trade-offs by providing the performance of enterprise-grade commercial databases with the flexibility and economics of open source. 

“Now, we’re reimagining the relational database again to deliver strong consistency, global availability, and virtually unlimited scalability, without having to choose between low latency or SQL,” he added. 

It will be interesting to see how Amazon fares against the competition, given that Microsoft Azure and Google Cloud already offer serverless SQL database capabilities. 

“The options available today force trade-offs. Some provide low latency and high availability, but not strong consistency or SQL compatibility. Others provide strong consistency and high availability, but can’t avoid very high latency and still don’t offer SQL compatibility,” said Amazon of Aurora DSQL’s competition. 

With this release, AWS celebrates 10 years of Aurora, which has grown to support hundreds of thousands of customers, thanks to innovations like serverless architecture and vector capabilities for AI.

]]>
AWS Becomes Anthropic’s Primary Cloud Partner with $4 Billion Investment https://analyticsindiamag.com/ai-news-updates/aws-becomes-anthropics-primary-cloud-partner-with-4-billion-investment/ Fri, 22 Nov 2024 15:30:16 +0000 https://analyticsindiamag.com/?p=10141491 Anthropic is working with AWS’s Annapurna Labs to develop next-gen Trainium hardware and improve the AWS Neuron software stack.]]>

Amazon Web Services (AWS) has announced a $4 billion investment in Anthropic, making AWS the primary cloud provider and training partner for the company.

This investment brings Amazon’s total funding in Anthropic to $8 billion, maintaining a minority stake while deepening technical and commercial ties. To date, Anthropic has raised a total of $13.7 billion in venture capital, according to Crunchbase.

The partnership will accelerate the development of Anthropic’s advanced AI systems, including optimising AWS’s Trainium accelerators for machine learning workloads.

Anthropic and AWS’s Annapurna Labs are collaborating on future generations of Trainium hardware, focusing on low-level kernel development and AWS Neuron software stack improvements. 

“Through deep technical collaboration, we’re writing low-level kernels that allow us to directly interface with the Trainium silicon, and contribute to the AWS Neuron software stack to strengthen Trainium,” said Anthropic in its blog post.

“Our engineers work closely with Annapurna’s chip design team to extract maximum computational efficiency from the hardware, which we plan to leverage to train our most advanced foundation models,” the company added. 

This hardware-software integration is aimed at maximising computational efficiency to support Anthropic’s foundation models.

According to a recent report, Amazon is planning to roll out its latest AI chip, Trainium 2, in the coming month. It has already been tested by Anthropic, Databricks, and Deutsche Telekom. Trainium 2 is part of Amazon’s larger strategy to optimise its data centre performance while reducing costs for Amazon Web Services (AWS) customers.

Anthropic said its Claude models, accessible through Amazon Bedrock, have become essential for businesses like Pfizer, Intuit, and Perplexity. These companies use Claude for tasks such as streamlining medical research, simplifying tax calculations, and enhancing AI-powered search. The European Parliament uses Claude to efficiently analyse millions of documents.

AWS’s security features allow organisations to deploy Anthropic’s AI models securely in environments such as AWS GovCloud and SageMaker for classified tasks. Government clients can access Claude’s capabilities through regulated cloud services, ensuring compliance with stringent requirements.

The company recently partnered with Palantir to provide its advanced AI model Claude to the US government for data analysis, and complex coding activities in projects of national security interest.

Anthropic rival OpenAI recently raised $6.6 billion at a valuation of $157 billion. The funding, led by existing investor Thrive Capital, has brought OpenAI’s total capital raised to $17.9 billion, according to Crunchbase. Thrive Capital contributed approximately $1.3 billion, with the option to invest an additional $1 billion at the same valuation through 2025.

Other major participants in this funding round include Microsoft, NVIDIA, SoftBank, Khosla Ventures, Altimeter Capital, Fidelity, and MGX.

Anthropic recently released an upgraded Claude 3.5 Sonnet model and the new Claude 3.5 Haiku, along with a public beta for an experimental feature called “computer use.” 

Over the last few months, it has also released several updates to its 3.5 series of Claude models, introducing impactful new features like Claude Artifacts, the Analysis tool, and Visual PDF.

In a recent podcast episode with Lex Fridman, Anthropic CEO Dario Amodei revealed that the Opus model isn’t going anywhere, adding that Anthropic will release the much-anticipated update and launch Claude 3.5 Opus. He further said that Claude 4.0 will be released according to the usual business cycle.

]]>
How AI Chips Stole the Spotlight in 2024 https://analyticsindiamag.com/ai-features/how-ai-chips-stole-the-spotlight-in-2024/ Fri, 22 Nov 2024 06:30:00 +0000 https://analyticsindiamag.com/?p=10141398 In the race to power AI applications, inference chips are the unsung heroes driving real-time decisions, from chatbots to recommendation engines]]>

When discussing AI, one often thinks about excellent software and intelligent programs. Behind the scenes, however, is a vast world of hardware that makes it all possible. Think of AI hardware and chip-making companies as the backstage crew of a big production, ensuring everything is perfectly aligned so AI can shine. 

Big tech companies like NVIDIA and AMD make special chips that power everything from driverless cars to smart gadgets. 

Companies are always trying to build faster and more powerful chips because AI needs a lot of power to perform its job seamlessly. The race to make the best chip intensifies as AI becomes more advanced. Every tiny improvement makes a big difference.

However, starting a hardware company is no easy feat. 

In India, a Surat-based company, Vicharak, took on the herculean task of churning out hardware in-house designed specifically for AI workloads. The company recently secured funding of ₹1 crore, boosting its valuation to ₹100 crore. 

Speaking with AIM, founder and CEO Akshar Vastarpara said that Vicharak’s focus is on creating hardware and redefining computing technology. 

“Our first target is to develop a GPU-like technology that can be used in mobile phones, laptops, and servers. We are approaching this in a very different way, starting with the consumer base but scaling to servers and lower-level areas as well,” Vastarpara explained.

This approach led to the creation of Vaaman, a compact computing board featuring a six-core ARM CPU and a field-programmable gate array (FPGA) with 1,12,128 logic cells. Its design offers 300 Mbps FPGA-CPU connectivity for hardware acceleration and parallel computing, addressing workloads beyond the reach of comparable boards.

The unique design garnered a lot of attention on social media.

In this article, AIM explores the importance of AI chips and the most effective strategies that have enhanced their performance in 2024.

The Inference Power Players

In the race to power AI applications, inference chips are the unsung heroes driving real-time decisions, from chatbots to recommendation engines. These specialised processors are the backbone of modern AI, delivering speed and efficiency where it matters most.

NVIDIA rolled out its highly anticipated H200 Tensor Core GPU, a successor to the H100 designed for generative AI and high-performance computing workloads. It introduced faster HBM3e memory for improved efficiency. 

Then came the B100 GPU, built on the new Blackwell architecture and tailored for AI training and inference, continuing NVIDIA’s focus on accelerating AI advancements. 

Earlier this year, NVIDIA launched its GH200 chip, combining a GPU and an ARM-based CPU. By October, OpenAI announced on X that it had received the first engineering builds of NVIDIA’s DGX B200.

Notably, NVIDIA CEO Jensen Huang personally delivered the first GPU chip to Elon Musk and presented the first DGX H200 to OpenAI’s Sam Altman and Greg Brockman.

In a similar vein, Microsoft announced that its Azure platform became the first cloud service to implement NVIDIA’s Blackwell system, featuring AI servers powered by the GB200.  NVIDIA reported generating a record-breaking $22.6 billion in data centre revenue this year, a 23% sequential and 427% year-over-year growth, fueled by demand for the Hopper GPU platform. During the earnings call, Huang hinted at future advancements, stating, “After Blackwell, there’s another chip. We’re on a one-year rhythm.”

On the other hand, Google’s parent company Alphabet released two notable AI chips, including the Cloud TPU v5p. The chips were specifically engineered for training LLMs and generative AI, with each TPU v5p pod containing 8,960 chips and offering 4,800 Gbps of inter-chip bandwidth. Google also launched Trillium, a high-performance chip for AI data centres offering nearly five times the speed of its predecessor, the TPU v5e.

​Both chips are integral to Google Cloud’s AI infrastructure, reinforcing Alphabet’s competitive edge in the AI chip market alongside its broader investments in custom hardware.

AMD announced the MI325X AI chip in June 2024. The company also introduced its next generation of EPYC and Ryzen processors, built on its latest Zen 5 CPU microarchitecture.

Earlier, the company launched the MI300A and MI300X AI chips. The MI300A combines a GPU with 228 compute units and 24 CPU cores, while the MI300X is a GPU-only model featuring 304 compute units. The MI300X competes with NVIDIA’s H100 on memory capacity and throughput.

AIM earlier talked about the integration of AMD’s EPYC CPUs with NVIDIA’s HGX and MGX GPUs, which enriches AI and data centre performance while supporting open standards for greater flexibility and scalability.

Similarly, AWS has extended its focus from cloud infrastructure to chips. Its Elastic Compute Cloud (EC2) Trn1 instances are purpose-built for deep learning and large-scale generative models, powered by AWS Trainium AI accelerator chips.

The trn1.2xlarge instance was the first iteration, with a single Trainium accelerator, 32 GB of instance memory, and 12.5 Gbps of network bandwidth. Amazon has since introduced the trn1.32xlarge instance, which has 16 accelerators, 512 GB of instance memory, and 1,600 Gbps of network bandwidth. The company is planning to roll out its latest AI chip, Trainium 2, in the coming month. As the Financial Times reported, the chip is likely to target AI model training at scale.

“The second version of Trainium – Trainium 2 – will start to ramp up in the next few weeks, and I think it’s going to be very compelling for customers on a price-performance basis,” said former AWS chief Andy Jassy.

The report further revealed that Amazon’s other AI chip, Inferentia, saves customers approximately 40% on costs for generating responses from AI models. 

In a bid to keep pace with the growing demand for semiconductors capable of training and deploying large AI models, Intel announced its latest AI chip Gaudi 3 at Intel Vision 2024.

The chip, first revealed by CEO Pat Gelsinger at the Intel AI Everywhere event, has double the power efficiency of its predecessor and is capable of running AI models 1.5 times faster than NVIDIA’s H100 GPU. 

It offers various configurations, including a bundle of eight Gaudi 3 chips on one motherboard or a card that can be integrated into existing systems.

Gaudi 3, built on a 5 nm process, signals Intel’s use of advanced manufacturing techniques. According to Gelsinger, Intel plans to manufacture AI chips, potentially for external companies, at a new Ohio factory expected to open in the coming years.

On the Edge of Innovation

Training and edge AI chips are the secret sauce fueling AI’s learning process, whether in the cloud or directly on your device. These chips transform raw data into actionable intelligence, driving AI’s next big leap.

American AI company Cerebras Systems, in collaboration with Abu Dhabi-based AI holding company G42, announced the development of Condor Galaxy 3 (CG-3), the latest addition to their AI supercomputing constellation, in 2024.

CG-3 features 64 of Cerebras’ newly launched CS-3 systems, each powered by the WSE-3 chip. Slated for availability by Q2 2024, it is set to deliver eight exaFLOPs of AI computing power. This marks the third generation of AI supercomputers released by Cerebras Systems in collaboration with G42.

The CS-3 system is built around the newly unveiled WSE-3 chip, which boasts 4 trillion transistors and offers 125 petaflops of peak AI performance. The WSE-3 is designed to double the performance of its predecessor at the same power consumption and price, making it well suited for training the industry’s largest AI models.

This announcement follows the release of the second phase of the Condor Galaxy supercomputer, known as Condor Galaxy 2, last November. 

The Apple Neural Engine, a set of specialised cores in Apple silicon, furthered the company’s AI hardware design and performance, beginning with the M1 chip for MacBooks. Compared to the previous generation, MacBooks with an M1 chip are 3.5 times faster in general performance and five times faster in graphics performance.

After the success of the M1 chip, the company released further generations. As of 2024, Apple’s latest is the M4 chip, currently available only in the iPad Pro. The M4 has a neural engine that is three times faster than the M1’s and a CPU that is 1.5 times faster than the M2’s.

“The new iPad Pro with M4 is a great example of how building best-in-class custom silicon enables breakthrough products,” said Johny Srouji, Apple’s senior vice president of hardware technology. 

After the success of its first specialised AI chip, Telum, IBM introduced its Telum II Processor in August. This processor is designed to power the next-generation IBM Z systems. In addition, IBM unveiled the Spyre Accelerator at the Hot Chips 2024 conference. These chips are likely to become available in 2025.

Clearly, IBM is determined to design a powerful successor that can outpace its competitors.

Currently, IBM is working on the NorthPole AI chip, which does not have a release date. 

]]>
AWS Launches Multi-Agent Orchestrator for Managing AI Agents https://analyticsindiamag.com/ai-news-updates/aws-launches-multi-agent-orchestrator-for-managing-ai-agents/ Mon, 18 Nov 2024 12:51:16 +0000 https://analyticsindiamag.com/?p=10141115 Developers can use pre-built scripts and demo applications, such as a chatbot for specialised queries or an AI-powered e-commerce support system.]]>

Amazon Web Services (AWS) has introduced Multi-Agent Orchestrator, a framework that offers a solution for managing multiple AI agents and handling complex conversations. 

The framework routes queries to the most suitable agent, maintains conversational context, and integrates with various environments, including AWS Lambda, local setups, and other cloud platforms.

The orchestrator supports Python and TypeScript, enabling dual-language implementation. It allows for both streaming and non-streaming responses from agents and includes pre-built options for rapid deployment. Additionally, it provides extensive features like intelligent intent classification, context management, and scalable integration of new agents.

AWS also published a demo on the GitHub repository highlighting its capabilities with six specialised agents, including ones for travel, weather, math, and health. The orchestrator switches between agents to manage multi-turn conversations and diverse tasks while preserving context.

Moreover, developers can also use pre-built scripts and demo applications, such as a chatbot for specialised queries or an AI-powered e-commerce support system. Sample projects include a multilingual chatbot for flight reservations and an AI customer support system balancing automated responses with human oversight.

The framework also supports voice-based interactions using Amazon Connect and Lex, showcasing its versatility in addressing a variety of use cases. With the ability to integrate with tools like Bedrock LLMs and Lex Bots, the orchestrator is positioned as a flexible choice for enterprises managing complex AI deployments.
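The routing behaviour described above can be sketched in a few lines of Python. The class and method names below are illustrative stand-ins, not the framework's actual API: a real deployment would back both the intent classifier and the agents with LLMs rather than keyword matching.

```python
class Agent:
    """A minimal agent that knows its domain keywords and produces replies.
    Illustrative only; the real framework wraps LLM-backed agents."""

    def __init__(self, name, keywords):
        self.name = name
        self.keywords = keywords

    def matches(self, query):
        # Naive intent score; the real orchestrator uses an LLM classifier.
        q = query.lower()
        return sum(1 for kw in self.keywords if kw in q)

    def respond(self, query, history):
        return f"[{self.name}] handling: {query}"


class Orchestrator:
    """Routes each query to the best-matching agent and keeps shared
    conversational context across turns, mirroring the behaviour the
    framework describes."""

    def __init__(self, agents, default):
        self.agents = agents
        self.default = default
        self.history = []  # context preserved across turns

    def route(self, query):
        best = max(self.agents, key=lambda a: a.matches(query))
        agent = best if best.matches(query) > 0 else self.default
        reply = agent.respond(query, self.history)
        self.history.append((agent.name, query, reply))
        return agent.name, reply
```

With a travel agent, a weather agent, and a general fallback registered, a query like "Book a flight to Delhi" lands on the travel agent, unmatched queries fall through to the default, and every turn is appended to the shared history.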

As AI moves towards an agentic future, several multi-agent frameworks have been introduced recently.

Microsoft Research unveiled Magentic-One, a generalist multi-agent system capable of solving open-ended tasks across diverse domains. Available as an open-source tool on Microsoft AutoGen, Magentic-One helps developers and researchers create agentic applications for managing complex, multi-step tasks autonomously.

OpenAI introduced Swarm, a framework for building, orchestrating, and deploying multi-agent systems.

Similarly, IBM launched the Bee Agent Framework, an open-source toolkit for creating and deploying agent-based workflows at scale. Currently, in its alpha stage, Bee Agent supports various AI models and offers compatibility with IBM Granite and Llama 3.2 models.

The framework enables developers to build efficient agents with minimal modifications to existing implementations while optimising for other widely used LLMs.

]]>
In Anthropic We Trust  https://analyticsindiamag.com/ai-features/in-anthropic-we-trust/ Wed, 13 Nov 2024 11:30:00 +0000 https://analyticsindiamag.com/?p=10140912 CEO Dario Amodei has repeatedly revealed his ambition to use Claude to support the US government and its interests in protecting national security. ]]>

Over the last few days, we have seen a series of announcements highlighting generative AI firms forming partnerships with the US government to provide AI technology for military and defence. Anthropic is definitely on top of that list. Not only is it big tech’s favourite child, but it has also secured its place in the public sector and government organisations.

Recently, the company partnered with Palantir to provide its advanced AI model Claude to the US government for data analysis, and complex coding activities in projects of national security interest. This partnership involves an IL6 accreditation, just one level below the top secret tier. 

It didn’t take long for the partnership to spark a debate around the company’s commitment to building AI responsibly, especially as its CEO, Dario Amodei, is well-known for his views on building an AI that prioritises safety. 

Recently, Anthropic released a statement urging governments to take action and bring in regulations to enforce the safe and ethical use of AI. “Governments should urgently take action on AI policy in the next eighteen months. The window for proactive risk prevention is closing fast,” said Anthropic.

Moreover, Anthropic also hired a full-time AI welfare expert to explore the moral and ethical implications of AI. People were quick to question whether Amodei and Anthropic’s views on AI were mere virtue signalling and were disappointed about their partnership with the US government. 

The announcement also came just days before the election results in the US, where Donald Trump is set to take charge as the 47th President. These concerns stem from Trump’s desire to loosen AI regulations. His allies have drafted an order to rapidly maximise AI usage for defence. 

No Surprise Moves

It is premature to defend or criticise Anthropic. They have played the game fair and square throughout, at least in terms of transparency. Amodei, on multiple occasions, revealed his ambition to use Claude to support the government and its interests in protecting national security. 

In his recent essay ‘Machines of Loving Grace’, Amodei said, “On the international side, it seems very important that democracies have the upper hand on the world stage when powerful AI is created.”

“AI-powered authoritarianism seems too terrible to contemplate, so democracies need to be able to set the terms by which powerful AI is brought into the world, both to avoid being overpowered by authoritarians and to prevent human rights abuses within authoritarian countries,” he added.

Anthropic has also been quite transparent about the same and has revealed its intent to provide its technology for government use. Earlier in June, it revealed its plans to expand Claude’s access for government use. Anthropic made its Claude models available on the AWS marketplace for the US Intelligence Community.

“Claude offers a wide range of potential applications for government agencies, both in the present and looking towards the future. Government agencies can use Claude to provide improved citizen services, streamline document review and preparation, enhance policymaking with data-driven insights, and create realistic training scenarios,” said Anthropic in a statement.

At the same time, Anthropic proposed amendments to California’s Senate Bill 1047 (SB 1047). Notably, the proposed amendments include exempting US military and intelligence operations from liability for “critical harms”.

Walking on a Tightrope

Anthropic also intends to strike a balance between its two ambitions. This year, Anthropic partnered with the US Artificial Intelligence Safety Institute (AISI US) and has also been working with AISI UK to test its models for safety. Last year, Anthropic developed ‘Constitutional AI’ to align its LLMs to “abide by high-level normative principles written into a constitution”. 

In September 2023, Anthropic published a responsible scaling policy, a series of protocols, and security levels. “Our RSP defines a framework called AI Safety Levels (ASL) for addressing catastrophic risks, modelled loosely after the US government’s biosafety level (BSL) standards for handling of dangerous biological materials,” read the report.

With its commitment to ethics and morals, Anthropic wants to be the first to foster a strong relationship with the government. Its updated usage policy introduced an exception that will allow governments to use its models, while also stating that it will continue to prevent any activities that are morally questionable. 

“With carefully selected government entities, we may allow foreign intelligence analysis in accordance with applicable law. All other use restrictions in our usage policy, including those prohibiting use for disinformation campaigns, the design or use of weapons, censorship, domestic surveillance, and malicious cyber operations, remain,” Anthropic wrote in the statement.  

In comparison, OpenAI hasn’t been actively partnering with the government. However, some reports surfaced claiming it was ‘quietly’ pitching its tech to the government. Several employees of OpenAI, including many from the safety team, have also left the company. 

Actions Speak

One of Anthropic’s major investors is Amazon, and the company is set to raise another round of funding. As mentioned, Anthropic recently made Claude available on the AWS Marketplace.

Most public sector technology is hosted on AWS, and Amazon, one of the biggest companies in the US, certainly benefits from close ties with the government. 

Walking the talk pays off. Anthropic has consistently championed safety and security, earning trust and partnerships with public sector companies. In contrast, OpenAI introduced these priorities later, making building trust harder. 

]]>
AWS to Soon Roll Out Trainium 2 Chip to Scale Anthropic’s Claude https://analyticsindiamag.com/ai-news-updates/aws-to-soon-roll-out-trainium-2-chip-to-scale-anthropics-claude/ Wed, 13 Nov 2024 07:11:45 +0000 https://analyticsindiamag.com/?p=10140886 OpenAI partners with TSMC and Broadcom to launch the first in-house AI chip in 2026. ]]>

Amazon is planning to roll out its latest AI chip, Trainium 2, in the coming month, most likely targeting AI model training at scale, the Financial Times reported.

Already tested by partners such as Anthropic, Databricks, and Deutsche Telekom, Trainium 2 is part of Amazon’s larger strategy to optimise its data centre performance while reducing costs for Amazon Web Services (AWS) customers.

“The price of cloud computing tends to be much larger for machine learning and AI,” explained AWS vice president Dave Brown, adding that savings of 40% on large AI workloads can significantly impact customer choices. 

AWS chief Andy Jassy said, “The second version of Trainium, Trainium 2, will start to ramp up in the next few weeks, and I think it’s going to be very compelling for customers on a price-performance basis.”

Read: GenAI Boom Bleeds Users, Fills AWS, Azure, GCP’s Coffers 

The report further said that Amazon’s other AI chip, Inferentia, is reported to save customers approximately 40% on costs for generating responses from AI models. 

Recently, reports surfaced that AWS plans to invest more in the AI startup Anthropic. However, the company has set one condition, requiring Anthropic to use a large number of servers powered by AI chips developed in-house by Amazon.

Meanwhile, NVIDIA is building the next level of computing capabilities for its customers, including OpenAI, which are looking for test-time compute chips for o1-like models. 

In a recent podcast with No Priors, NVIDIA chief Jensen Huang shared that one of the major challenges NVIDIA is currently facing in computing is inference time scaling, which involves generating tokens at incredibly low latency.

Huang explained that, in the future, AI systems will need to perform tasks like tree search, chain of thought, and mental simulations, reflecting on their own answers. The model would prompt itself and generate text internally, all while responding in real-time, ideally within a second. This approach subtly points to the capabilities of the o1 system.
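The idea Huang describes, spending extra inference-time compute and letting the system check its own answers, can be illustrated with a deterministic toy. Everything here is a stand-in: `generate_candidates` mimics a model sampling several answers to a "sum these numbers" question, and `score` mimics the model critiquing them; a real system would re-prompt an LLM for both steps.

```python
def generate_candidates(question, n):
    """Stand-in for a model sampling n candidate answers; fixed offsets
    mimic sampling diversity deterministically."""
    true_answer = sum(question)
    offsets = [-2, 1, 3, 0, -1, 2]
    return [true_answer + offsets[i % len(offsets)] for i in range(n)]

def score(question, answer):
    """Stand-in for self-reflection: the system verifies its own work.
    Here verification is exact arithmetic; a real system would re-prompt
    the model to critique the candidate."""
    return -abs(sum(question) - answer)

def answer_with_test_time_compute(question, n):
    """More candidates means more inference-time compute spent per query,
    and a better chance the best-scoring candidate is correct."""
    candidates = generate_candidates(question, n)
    return max(candidates, key=lambda a: score(question, a))
```

With a budget of two candidates the best available answer to summing [2, 3, 4] is off by one; doubling the budget surfaces the exact answer of 9, which is the scaling behaviour test-time compute is after.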

Meanwhile, OpenAI plans to partner with TSMC and Broadcom to launch its first in-house AI chip by 2026. This move comes after the startup began exploring a new method to scale up its models, particularly o1, using the test-time compute approach.

AWS on AI Chip Expansion Mode 

AWS recently announced a $110 million investment to support university-led research in generative AI through its new “Build on Trainium” program. This initiative provides compute hours and AWS Trainium credits, giving researchers access to Trainium UltraClusters for large-scale AI research, covering topics from AI architecture to machine learning (ML) library development.

The Build on Trainium program aims to advance AI research by offering access to up to 40,000 Trainium chips, facilitating work on distributed systems, algorithmic improvements, and AI accelerator performance. AWS developed Trainium as a specialised chip for deep learning and inference, enabling high-performance AI experiments previously limited by budget constraints.

As part of the program, AWS will conduct rounds of Amazon Research Awards calls for proposals. Selected institutions receive AWS Trainium credits and access to resources for exploring innovations in AI. 

Participants include prominent research institutions, such as Carnegie Mellon University (CMU) and the University of California at Berkeley, focusing on ML systems and compiler optimisations.

Build on Trainium also offers training and resources to grant recipients. AWS provides technical education and connects researchers with the Neuron Data Science community, fostering knowledge sharing among AWS specialists, startups, and the Generative AI Innovation Center.

]]>
AWS, Anthropic, Palantir Join Forces to Bring Gen AI to US Defense https://analyticsindiamag.com/ai-news-updates/aws-anthropic-and-palantir-join-forces-to-bring-generative-ai-to-us-defense-and-intelligence/ Thu, 07 Nov 2024 18:19:25 +0000 https://analyticsindiamag.com/?p=10140594 Meanwhile, OpenAI has yet to make similar announcements despite its cautious strategy of withholding major releases, such as GPT-5. ]]>

Anthropic, Palantir Technologies, and AWS recently announced a partnership to provide US intelligence and defense agencies with the Claude 3 and 3.5 models, integrated within Palantir’s AI Platform (AIP) and supported by AWS. With this, the trio aims to enable rapid data analysis, improved pattern recognition, and enhanced document review to support critical government functions.

Dave Levy, VP of worldwide public sector at AWS, said that they are excited to partner with Anthropic and Palantir to offer new generative AI capabilities that will drive innovation across the public sector.

“Our partnership with Anthropic and AWS provides US defense and intelligence communities the tool chain they need to harness and deploy AI models securely, bringing the next generation of decision advantage to their most critical missions,” said Shyam Sankar, chief technology officer at Palantir.  

This collaboration leverages Palantir’s IL6-accredited AIP and AWS’s SageMaker for highly secure, agile, and efficient AI deployment. Both Palantir and AWS have received the Defense Information Systems Agency (DISA) IL6 accreditation, which demands some of the highest information security standards.

“Access to Claude 3 and Claude 3.5 within the Palantir AI Platform on AWS will equip US defense and intelligence organisations with powerful AI tools that can rapidly process and analyse vast amounts of complex data. This will dramatically improve intelligence analysis and enable officials in their decision-making processes,” said Kate Earle Jensen, head of sales and partnerships at Anthropic. 

Anthropic In, OpenAI? 

The announcement comes just days after Meta announced the availability of Llama open source models to the US government and other private projects working towards the interests of the USA’s national security. 

Moreover, newly elected President Trump’s allies have drafted an order that advocates for an unprecedented use of artificial intelligence technologies for military purposes and applications. 

With Trump insisting on loosening regulations and guardrails for AI, it will be interesting to see how the ecosystem evolves, and we’re likely to come across many such partnerships in the future. 

Ahead of Trump’s victory, Anthropic had announced its intent to expand access to Claude’s AI capabilities to support government initiatives. The blog post also mentions that Claude will allow ‘carefully selected government agencies’ to legally use it for intelligence operations. 

At the time, Amodei had emphasised the importance of responsible AI deployment, stating, “It makes democracy as a whole more effective, and if we provide them poorly, it undermines the notion of democracy.”

Meanwhile, OpenAI has yet to make similar announcements despite its cautious strategy of withholding major releases, such as GPT-5, during election periods to avoid influencing outcomes. 

Now that the US election has concluded with Trump’s victory, the question remains: will OpenAI finally release GPT-5? Only time will tell. “It is critically important that the US maintains its lead in developing AI with democratic values,” said Altman, in a post on X, congratulating Trump.  

]]>
​​GenAI Boom Bleeds Users, Fills AWS, Azure, GCP’s Coffers https://analyticsindiamag.com/ai-trends/genai-boom-bleeds-users-fills-aws-azure-gcps-coffers/ Tue, 05 Nov 2024 09:22:28 +0000 https://analyticsindiamag.com/?p=10140198 Amazon will spend $75 billion in 2024, focussing on infrastructure, data centres, and other resources essential for running AWS. ]]>

As we wrap up another quarter, AIM reviews how different cloud service providers have performed. In the latest quarter, Amazon Web Services (AWS) earned $27.5 billion in revenue, reporting a 19% increase from the same quarter a year ago. Microsoft’s Intelligent Cloud revenue reached $24.1 billion, a jump of 20%. Meanwhile, Google Cloud’s revenues jumped 35%, reaching $11.4 billion.

AWS continues to lead in market share with 31%, followed by Azure at 20% and Google Cloud at 12%.

At the Core of Generative AI Growth

During the company’s latest earnings call, Microsoft CEO Satya Nadella revealed that its AI business is on track to surpass an annual revenue run rate of $10 billion in the next quarter. According to him, Microsoft is the fastest company in history to reach this milestone.

Nadella shared that Microsoft is experiencing an increase in the number of customers utilising Azure AI services to build their own co-pilots and agents. “Azure OpenAI usage more than doubled over the past six months as both digital natives like Grammarly and Harvey, as well as established enterprises like Bajaj Finance, Hitachi, KT, and LG, move their apps from testing to production,” he said.

Citing the example of GE Aerospace, Nadella mentioned that the company has used Azure OpenAI to build a new digital assistant for its 52,000 employees. “In just three months, it has been used to conduct over 5,00,000 internal queries and process more than 2,00,000 documents,” said Nadella.


Microsoft was the first hyperscaler to deploy NVIDIA’s Blackwell system with GB200-powered AI servers. 

Furthermore, in terms of LLMs, Microsoft has expanded its capabilities by adding support for OpenAI’s latest model family, o1. “We’re also bringing industry-specific models through Azure AI, including a collection of best-in-class multimodal models for medical imaging,” said Nadella.

Having said that, Microsoft is expecting a $1.5 billion loss from its investment in OpenAI. 

On the other hand, AWS is confident about its generative AI business. “AWS’s AI business has a multibillion-dollar revenue run rate that continues to grow at a triple-digit year-over-year percentage,” said AWS chief Andy Jassy, adding that it is currently growing more than three times faster than AWS itself did at a comparable stage of its evolution.

Jassy disclosed that Amazon will spend about $75 billion in 2024, primarily on infrastructure, data centres, and other resources essential for running AWS. “We’ll spend more than that in 2025, with the majority allocated to AWS,” he said while mentioning that the increases are primarily driven by generative AI.

Similar to how Microsoft is betting on OpenAI, AWS is working closely with Anthropic. The company recently added Claude 3.5 Sonnet to Amazon Bedrock, alongside Meta’s Llama 3.2 models, Mistral’s Large 2 models, and multiple Stability AI models.

Jassy also shared that customers love Amazon Q. “We’re continuing to see strong adoption of Amazon Q, the most capable generative AI-powered assistant for software development and for leveraging your own data. Q has the highest reported code acceptance rates in the industry for multiline code suggestions,” he added.

Notably, Amazon recently added an inline chat feature to Amazon Q, powered by Claude 3.5 Sonnet.

At the same time, Google chief Sundar Pichai revealed that Google Gemini API calls have increased 14x in the past six months. 

Google Cloud offers the Vertex AI platform, which features a comprehensive suite of MLOps tools for using, deploying, and monitoring AI models. This platform also offers the Gemini API.

Lowering Cloud Costs

One common goal among all three cloud service providers is to reduce cloud costs for their customers. Jassy said as customers begin to scale their implementations on the inference side, they quickly realise that it can become costly. 

To lower these costs, AWS is developing Trainium and Inferentia, its own custom silicon chips. “The second version of Trainium, Trainium2, will start to ramp up in the next few weeks, and I think it’s going to be very compelling for customers on a price-performance basis,” Jassy further said.

Similarly, Google is currently developing Trillium, its sixth generation of Tensor Processing Unit (TPU). “Using a combination of our TPUs and GPUs, LG AI Research reduced inference processing time for its multimodal model by more than 50% and operating costs by 72%,” said Pichai.

Google CFO Anat Ashkenazi revealed that the company invested $13 billion in capital expenditures (CapEx) during the latest quarter. She disclosed that 60% of that investment in technical infrastructure went towards servers and about 40% towards data centres and networking equipment.

Likewise, Microsoft is building Maia 100, an AI accelerator specifically created for large-scale AI workloads deployed in Azure. 

Moreover, Nadella said that Microsoft is not in the business of selling raw GPUs for others to use to train their models. Instead, he highlighted the rapid growth in AI-related revenue driven by inference. 

He expressed confidence in the quality of Microsoft’s revenue, as it comes from established enterprise needs rather than model training. Nadella also specified that $10 billion of this revenue is coming from inference.

Additionally, Microsoft CFO Amy Hood said that revenue growth from inference and applications helps fund further training investments. She stressed that training is not a separate phase but part of a continuous cycle of investment, growth, and development in AI for the company.  

Competition among Microsoft, AWS, and Google will only intensify. The conversation is now shifting from LLMs to agents, which will require even more compute.

]]>
AWS Names Accenture Veteran Sandeep Dutta as India and South Asia President https://analyticsindiamag.com/ai-news-updates/aws-names-accenture-veteran-sandeep-dutta-as-india-and-south-asia-president/ Mon, 04 Nov 2024 13:09:31 +0000 https://analyticsindiamag.com/?p=10140129 AWS’s existing cloud infrastructure has contributed more than INR 30,900 crore ($3.7 billion) to India’s GDP, supporting over 39,500 jobs.]]>

Amazon Web Services (AWS) has appointed Sandeep Dutta as President of AWS India and South Asia. Dutta, who brings over two decades of experience from Accenture, will lead AWS’s growth and operations across India, supporting businesses, governments, and communities on their cloud journey.

The announcement was made by Jaime Valles, vice president at AWS, in a LinkedIn post. “Sandeep’s arrival marks an important milestone in our journey to empower businesses, governments, and communities throughout the subcontinent to realise their full digital potential,” said Valles.

Valles said that Dutta understands the challenges and opportunities facing organisations in India. “He’s recognised as a thought leader on disruptive technologies, and his strategic vision has been pivotal in helping customers across industries reimagine what’s possible,” he said.

“Alongside the rest of our AWS APJ team, we’ll work together to continue to help our customers harness the power of the cloud to enable cultural transformation and digital innovation, deliver value, and solve India’s hardest problems and biggest ambitions,” he added. 

AWS has laid out ambitious plans to bolster India’s digital future, committing to a cumulative investment of INR 1,05,600 crore (US$12.7 billion) by 2030. This initiative is expected to contribute INR 1,94,700 crore (US$23.3 billion) to India’s GDP and create over 1,31,700 full-time equivalent jobs annually. 

As of now, AWS’s existing cloud infrastructure has contributed more than INR 30,900 crore (US$3.7 billion) to India’s GDP, supporting over 39,500 jobs.

In addition to its economic investments, AWS has also committed to fostering a sustainable digital future for India. The company is backing 50 renewable energy projects across the country. These initiatives will collectively provide over 1.1 gigawatts of clean energy, reducing AWS’s carbon footprint in India and supporting the country’s green energy goals. 

]]>
Generative AI Cost Optimisation Strategies https://analyticsindiamag.com/ai-highlights/generative-ai-cost-optimisation-strategies/ Thu, 03 Oct 2024 07:21:49 +0000 https://analyticsindiamag.com/?p=10137322 As an executive exploring generative AI’s potential for your organisation, you’re likely concerned about costs. Implementing AI isn’t just about picking a model and letting it run. It’s a complex ecosystem of decisions, each affecting the final price tag. This article will guide you to optimise costs throughout the AI life cycle, from model selection […]]]>

As an executive exploring generative AI’s potential for your organisation, you’re likely concerned about costs. Implementing AI isn’t just about picking a model and letting it run. It’s a complex ecosystem of decisions, each affecting the final price tag. This article will guide you to optimise costs throughout the AI life cycle, from model selection and fine-tuning to data management and operations.

Model Selection

Wouldn’t it be great to have a lightning-fast, highly accurate AI model that costs pennies to run? Since this ideal scenario does not exist (yet), you must find the optimal model for each use case by balancing performance, accuracy, and cost.

Start by clearly defining your use case and its requirements. These questions will guide your model selection:

  • Who is the user?
  • What is the task?
  • What level of accuracy do you need?
  • How critical is rapid response time to the user?
  • What input types will your model need to handle, and what output types are expected?

Next, experiment with different model sizes and types. Smaller, more specialised models may lack the broad knowledge base of their larger counterparts, but they can be highly effective—and more economical—for specific tasks.

Consider a multi-model approach for complex use cases. Not all tasks in a use case may require the same level of model complexity. Use different models for different steps to improve performance while reducing costs.
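The multi-model idea can be made concrete with a minimal routing sketch: send each step to the cheapest model that meets its needs. The model names, task categories, and per-token prices below are illustrative assumptions, not real provider pricing:

```python
# Hypothetical per-1K-token prices; real rates vary by provider and model.
MODELS = {
    "small": {"cost_per_1k_tokens": 0.0002},
    "large": {"cost_per_1k_tokens": 0.0150},
}

def pick_model(task_type: str) -> str:
    """Route simple tasks to a cheap model, complex ones to a capable model."""
    simple = {"classification", "extraction", "summarisation"}
    return "small" if task_type in simple else "large"

def estimate_cost(task_type: str, tokens: int) -> float:
    """Estimate the cost of one call, given the routed model's token price."""
    model = pick_model(task_type)
    return tokens / 1000 * MODELS[model]["cost_per_1k_tokens"]

# 10,000 classification calls of ~500 tokens each, routed vs. not routed
small_bill = sum(estimate_cost("classification", 500) for _ in range(10_000))
large_bill = 10_000 * 500 / 1000 * MODELS["large"]["cost_per_1k_tokens"]
print(f"small: ${small_bill:.2f}, large: ${large_bill:.2f}")  # → small: $1.00, large: $75.00
```

In practice, the router would sit in front of your model-invocation layer, and the price table would be driven by your provider’s actual rate card.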

Fine-Tuning and Model Customisation

Pretrained foundation models (FMs) are publicly available and can be used by any company, including your competitors. While powerful, they lack the specific knowledge and context of your business.

To gain a competitive advantage, you need to infuse these generic models with your organisation’s unique knowledge and data. Doing so transforms an FM into a powerful, customised tool that understands your industry, speaks your company’s language, and leverages your proprietary information. Your choice to use retrieval-augmented generation (RAG), fine-tuning, or prompt engineering for this customisation will affect your costs.

Retrieval-Augmented Generation

RAG pulls data from your organisation’s data sources to enrich user prompts so they deliver more relevant and accurate responses. Imagine your AI being able to instantly reference your product catalogue or company policies as it generates responses. RAG improves accuracy and relevance without extensive model retraining, balancing performance and cost efficiency.
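As a rough illustration of the pattern (not any vendor’s implementation), a toy RAG pipeline might look like the sketch below, with naive keyword overlap standing in for a real vector-based retriever and the final prompt destined for whichever model you call:

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query (a stand-in for vector search)."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Enrich the user prompt with retrieved context before sending it to the model."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Return policy: items may be returned within 30 days.",
    "Shipping: orders over $50 ship free.",
    "Warranty: electronics carry a one-year warranty.",
]
print(build_prompt("What is the return policy?", docs))
```

The key cost property is visible even in this toy: the base model is never retrained, so keeping responses current only requires updating the document store.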

Fine-Tuning

Fine-tuning means training an FM on additional, specialised data from your organisation. It requires significant computational resources, machine learning expertise, and carefully prepared data, making it more expensive to implement and maintain than RAG.

Fine-tuning excels when you need the model to perform exceptionally well on specific tasks, consistently produce outputs in a particular format, or perform complex operations beyond simple information retrieval.

We recommend a phased approach. Start with less resource-intensive methods such as RAG and consider fine-tuning only when these methods fail to meet your needs. Set clear performance benchmarks and regularly evaluate the gains versus the resources invested.

Prompt Engineering

Prompts are the instructions given to AI applications. AI users, such as designers, marketers, or software developers, enter prompts to generate the desired output, such as pictures, text summaries or source code. Prompt engineering is the practice of crafting and refining these instructions to get the best possible results. Think of it as asking the right questions to get the best answers.

Good prompts can significantly reduce costs. Clear, specific instructions reduce the need for multiple back-and-forth interactions that can quickly add up in pay-per-query pricing models. They also lead to more accurate responses, reducing the need for costly, time-consuming human review. With prompts that provide more context and guidance, you can often use smaller, more cost-effective AI models.
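For example, compare a vague instruction with a specific, reusable template; the wording and the sample ticket below are purely illustrative:

```python
# A vague prompt invites long, inconsistent answers and costly retries.
vague = "Summarise this support ticket."

# A specific, reusable template constrains format and length up front.
specific = (
    "Summarise the support ticket below in exactly 3 bullet points: "
    "the customer's problem, the product affected, and the requested action. "
    "Use plain English and at most 15 words per bullet.\n\nTicket: {ticket}"
)

def render(template: str, **fields: str) -> str:
    """Fill a reusable prompt template with per-request fields."""
    return template.format(**fields)

print(render(specific, ticket="My Model X charger stopped working; I want a replacement."))
```

Templates like this also make prompts versionable and testable, so improvements can be measured rather than guessed at.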

Data Management

The data you use to customise generic FMs is also a significant cost driver. Many organisations fall into the trap of thinking that more data always leads to better AI performance. In reality, a smaller dataset of high-quality, relevant data often outperforms larger, noisier datasets.

Investing in robust data cleansing and curation processes can reduce the complexity and cost of customising and maintaining AI models. Clean, well-organised data allows for more efficient fine-tuning and produces more accurate results from techniques like RAG. It lets you streamline the customisation process, improve model performance, and ultimately lower the ongoing costs of your AI implementations.

Strong data governance practices can help increase the accuracy and cost performance of your customised FM. It should include proper data organisation, versioning, and lineage tracking. On the other hand, inconsistently labelled, outdated, or duplicate data can cause your AI to produce inaccurate or inconsistent results, slowing performance and increasing operational costs. Good governance helps ensure regulatory compliance, preventing costly legal issues down the road.

Operations

Controlling AI costs isn’t just about technology and data—it’s about how your organisation operates.

Organisational Culture and Practices

Foster a culture of cost-consciousness and frugality around AI, and train your employees in cost-optimisation techniques. Share case studies of successful cost-saving initiatives and reward innovative ideas that lead to significant cost savings. Most importantly, encourage a prove-the-value approach for AI initiatives. Regularly communicate the financial impact of AI to stakeholders.

Continuous learning about AI developments helps your team identify new cost-saving opportunities. Encourage your team to test various AI models or data preprocessing techniques to find the most cost-effective solutions.

FinOps for AI

FinOps, short for financial operations, is a practice that brings financial accountability to the variable spend model of cloud computing. It can help your organisation efficiently use and manage resources for training, customising, fine-tuning, and running your AI models. (Resources include cloud computing power, data storage, API calls, and specialised hardware like GPUs). FinOps helps you forecast costs more accurately, make data-driven decisions about AI spending, and optimise resource usage across the AI life cycle.

FinOps balances a centralised organisational and technical platform that applies the core FinOps principles of visibility, optimisation, and governance with responsible and capable decentralised teams. Each team should “own” its AI costs—making informed decisions about model selection, continuously optimising AI processes for cost efficiency, and justifying AI spending based on business value.

A centralised AI platform team supports these decentralised efforts with a set of FinOps tools and practices that includes dashboards for real-time cost tracking and allocation, enabling teams to closely monitor their AI spending. Anomaly detection allows you to quickly identify and address unexpected cost spikes. Benchmarking tools facilitate efficiency comparisons across teams and use cases, encouraging healthy competition and knowledge sharing.
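The anomaly-detection idea can be sketched in a few lines: flag any day whose spend sits several standard deviations above a trailing window’s mean. The window size, threshold, and spend figures here are illustrative assumptions, not a production detector:

```python
from statistics import mean, stdev

def flag_anomalies(daily_spend: list[float], window: int = 7, z: float = 3.0) -> list[int]:
    """Return indices of days whose spend exceeds the trailing mean by z standard deviations."""
    flagged = []
    for i in range(window, len(daily_spend)):
        recent = daily_spend[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        if sigma > 0 and daily_spend[i] > mu + z * sigma:
            flagged.append(i)
    return flagged

# 14 days of steady spend, then a runaway training job on the last day
spend = [100, 102, 98, 101, 99, 103, 100, 97, 101, 100, 102, 99, 98, 100, 450]
print(flag_anomalies(spend))  # flags the final day (index 14)
```

A real FinOps pipeline would run a check like this against cost-allocation tags per team, so the alert lands with the team that owns the spend.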

Conclusion

As more use cases emerge and AI becomes ubiquitous across business functions, organisations will be challenged to scale their AI initiatives cost-effectively. They can lay the groundwork for long-term success by establishing robust cost optimisation techniques that allow them to innovate freely while ensuring sustainable growth. After all, success depends on perfecting the delicate balance between experimentation, performance, accuracy, and cost.

]]>
PSB Alliance Empanels AWS to Provide Cloud Computing Services to Public Sector Banks https://analyticsindiamag.com/ai-news-updates/psb-alliance-empanels-aws-to-provide-cloud-computing-services-to-public-sector-banks/ Wed, 25 Sep 2024 09:48:54 +0000 https://analyticsindiamag.com/?p=10136617 The empanelment will allow AWS to provide its services to member banks through two AWS Managed Service Providers (MSP) – Orient Technologies and Hitachi Systems.]]>

Amazon Web Services (AWS) India Private Limited has announced that PSB Alliance Private Limited (PSBA) has empanelled AWS as a cloud service provider to offer cloud computing services to India’s public sector banks.

PSBA is an umbrella organisation formed by 12 public sector banks and acts as a nodal body to deliver end-to-end technology-enabled banking services to them. It supports the Government of India’s financial reforms and Enhanced Access and Service Excellence (EASE) agenda, and aims to foster innovation in the financial ecosystem.

This empanelment will enable the public sector banks to seamlessly adopt AWS cloud services through PSBA’s Community Cloud Services, without a need to create separate procurement processes for their cloud computing requirements.

The empanelment will allow AWS to provide its services to member banks through two AWS Managed Service Providers (MSP) – Orient Technologies and Hitachi Systems.

Moreover, the banks will be able to migrate to the cloud and adopt a more agile technology infrastructure in line with their business goals and digital transformation strategy, and leverage technology expertise from the MSPs.

In addition, PSBA will provide banks access to software-as-a-service (SaaS) based financial service marketplaces, enabling them to offer services such as doorstep banking, and subscribe to eBKray, an innovative end-to-end listing and auction platform to manage Non-Performing Asset (NPA) loans.

“Over the next few years, we expect several public sector banks to shift their non-core banking workloads to the cloud. Our endeavour to empanel AWS allows public sector banks to draw on the experience of their peers, establish and reinforce industry best practices, and take advantage of the combined scale of operations. Our member banks can secure better pricing structures and enhanced value from cloud services,” said Eric Anklesaria, Senior Advisor, PSBA.

As next steps, the MSPs will conduct a cloud migration assessment and readiness study for each of the public sector banks and help them build a migration roadmap.

Initially, the banks will use cloud services for non-core banking applications, such as loan origination system, loan management system, cash management system, WhatsApp banking, IVR system (contact center), supply chain finance, customer relationship management, human resource applications, development and user acceptance test environments, data lake and data analytics, and microfinance credit management.

]]>
AWS Selects Seven Indian Startups for Global Generative AI Accelerator Program https://analyticsindiamag.com/ai-news-updates/aws-selects-seven-indian-startups-for-global-generative-ai-accelerator-program/ Fri, 13 Sep 2024 05:28:18 +0000 https://analyticsindiamag.com/?p=10135163 Amazon Web Services (AWS) has selected seven generative AI startups from India to join its prestigious Global Generative AI Accelerator program. India has the highest representation in the Asia-Pacific and Japan region, showcasing the country’s growing influence as an emerging AI hotspot. The chosen startups—Convrse, House of Models, Neural Garage, Orbo.ai, Phot.ai, Unscript AI, and […]]]>

Amazon Web Services (AWS) has selected seven generative AI startups from India to join its prestigious Global Generative AI Accelerator program. India has the highest representation in the Asia-Pacific and Japan region, underscoring the country’s growing influence as an emerging AI hotspot. The chosen startups—Convrse, House of Models, Neural Garage, Orbo.ai, Phot.ai, Unscript AI, and Zocket—are among 80 companies selected globally for their innovative use of AI and global growth potential.

Each of these startups will receive up to US$1 million in AWS credits, mentorship from global experts, and technical support to accelerate their AI development. They will also gain the opportunity to showcase their solutions at AWS re:Invent in Las Vegas later this year.

This selection is part of AWS’s US$230 million commitment to advancing generative AI technologies across the globe. Startups in the accelerator will benefit from access to AWS’s compute, storage, and database solutions, as well as energy-efficient AI chips like AWS Trainium and Inferentia2. 

The startups will also be able to leverage Amazon SageMaker for building and training foundation models, and Amazon Bedrock to develop generative AI applications securely.

Amitabh Nagpal, Head of Startup Business Development at AWS India, highlighted the potential of the selected Indian startups. “We are thrilled to announce the seven Indian startups that have been selected. These companies are driving innovation in generative AI and will help solve complex challenges in India and beyond,” he said.

The accelerator program lasts for 10 weeks and pairs startups with business and technical mentors from AWS and presenting partner NVIDIA, a leader in accelerated computing. The goal is to empower startups with the resources they need to develop, train, test, and scale their AI solutions effectively.

India’s AI sector continues to expand, with over US$82 billion invested in AI and ML startups in 2024, according to PitchBook Data. AWS is committed to fostering this growth through initiatives like the AWS GenAI Loft, a pop-up space in Bangalore designed to promote generative AI innovation.

Selected Startups from India for the AWS Global Generative AI Accelerator:

  • Convrse (Haryana): AI tool optimizing 3D objects for web or real-time use.
  • House of Models (Karnataka): Generates digital content using LLM and Diffuser models.
  • Neural Garage (Karnataka): Uses AI to sync actors’ lips with dubbed audio seamlessly.
  • Orbo.ai (Maharashtra): Provides AI-driven hyper-personalization in the beauty industry.
  • Phot.ai (Haryana): AI platform for photo editing and graphic design, targeting e-commerce sellers.
  • Unscript AI (Karnataka): Creates studio-quality videos with virtual or real actors.
  • Zocket (Tamil Nadu): Simplifies digital ad creation and targeting with AI.

AWS’s accelerator is expected to play a crucial role in shaping the future of AI startups, providing them with the tools and infrastructure needed to fulfill their ambitions on a global scale.

]]>
AWS Data Centres Cut Carbon Emissions by 98% for AI Workloads Compared to On-Premises Solutions: Study https://analyticsindiamag.com/ai-news-updates/aws-data-centres-cut-carbon-emissions-by-98-for-ai-workloads-compared-to-on-premises-solutions-study/ https://analyticsindiamag.com/ai-news-updates/aws-data-centres-cut-carbon-emissions-by-98-for-ai-workloads-compared-to-on-premises-solutions-study/#respond Mon, 05 Aug 2024 06:30:02 +0000 https://analyticsindiamag.com/?p=10131308 Accenture estimates that AWS’s global infrastructure is up to 4.1 times more efficient than on-premises. ]]>

A new study commissioned by Amazon Web Services (AWS) and conducted by Accenture found that using AWS data centres for compute-heavy workloads such as AI yields a 98% reduction in carbon emissions compared to on-premises data centres.

The study found that an effective way to minimise the environmental footprint of AI is to move IT workloads from on-premises infrastructure to cloud data centres, in India and around the globe. 

Accenture estimates that AWS’s global infrastructure is up to 4.1 times more efficient than on-premises. For Indian organisations, the total potential carbon reduction opportunity for AI workloads optimised on AWS is up to 99% compared to on-premises data centres.

This is credited to AWS’s utilisation of more efficient hardware (32%), improvements in power and cooling efficiency (35%), and additional carbon-free energy procurement (31%).

Further optimising on AWS by leveraging purpose-built silicon can increase the total carbon reduction potential of AI workloads to up to 99% for Indian organisations that migrate to and optimise on AWS.

“Considering 85% of global IT spend by organisations remains on-premises, a carbon reduction of up to 99% for AI workloads optimised on AWS in India is a meaningful sustainability opportunity for Indian organisations,” said Jenna Leiner, Head of Environment Social Governance (ESG) and External Engagement, AWS Global.

“As India accelerates towards its US$1 trillion digital opportunity and encourages investments into digital infrastructure, sustainability innovations and minimising IT-related carbon emissions will be critical in also helping India meet its net-zero emissions goal by 2070.

“This is particularly important given the rising adoption of AI. AWS is constantly innovating for sustainability across our data centres – optimising our data centre design, investing in purpose-built chips, and innovating with new cooling technologies – so that we continuously increase energy efficiency to serve customer compute demands,” Leiner said.

]]>
https://analyticsindiamag.com/ai-news-updates/aws-data-centres-cut-carbon-emissions-by-98-for-ai-workloads-compared-to-on-premises-solutions-study/feed/ 0
AWS, Google and Other Cloud Giants Go After AI Agents https://analyticsindiamag.com/ai-trends/aws-google-and-other-cloud-giants-go-after-ai-agents/ https://analyticsindiamag.com/ai-trends/aws-google-and-other-cloud-giants-go-after-ai-agents/#respond Sat, 13 Jul 2024 05:31:06 +0000 https://analyticsindiamag.com/?p=10126748 AWS specified that the agent’s code interpretation capabilities are used only when the LLM deems them necessary, making them semi-autonomous.]]>

At the AWS New York Summit this week, the cloud provider announced that AI agents built through Amazon Bedrock would have enhanced memory and code interpretation capabilities. AWS’ AI and data vice president, Swami Sivasubramanian, said that this was part of a larger update to AWS’ overall GenAI stack available to their enterprise customers.

“At the top layer, which includes generative AI-powered applications, we have Amazon Q, the most capable generative AI-powered assistant. The middle layer has Amazon Bedrock, which provides tools to easily and rapidly build, deploy, and scale generative AI applications leveraging LLMs and other foundation models (FMs).

“And at the bottom, there’s our resilient, cost-effective infrastructure layer, which includes chips purpose-built for AI, as well as Amazon SageMaker to build and run FMs,” he said.

These improved Bedrock agents are notable because they can carry out complex, multistep tasks, such as automating the processing of insurance claims or booking business flights, with prior knowledge of user preferences.

As mentioned before, these agents now also have code interpretation abilities, which means they can generate and run code when the LLM deems it necessary, “significantly expanding the use cases they can address, including complex tasks such as data analysis, visualisation, text processing, equation solving, and optimisation problems”, the company said on the update. 

Despite these updates, AWS still seems to be slightly behind: Azure announced similar enterprise AI agent-building capabilities in April, and GCP did so even earlier, though memory retention is not as seamless for agents built on GCP’s Vertex AI.

However, these rapid updates coming from the top three biggest cloud providers in the industry mean one thing: the next wave in the AI revolution is already underway.

Cloud > Generative AI > Building Agents

After the initial rush by businesses to move to cloud-based systems to stay competitive, companies have quickly grown wise to how these systems can be leveraged to get the most out of their data.

Long story short, cloud providers identified these needs, deploying all-encompassing generative AI capabilities for their customers. As Sivasubramanian said, “They need a full range of capabilities to build and scale generative AI applications tailored to their business and use case.”

Now, the shift towards focusing on building AI agents and improving their overall capabilities signifies a larger need to seamlessly connect all of these services under an easy-to-use interface for employees.

The entire point of deploying generative AI in enterprises is to streamline business operations. The focus on agents is particularly important because companies rely on the ability to customise and fine-tune their AI to fit specific, industry-relevant needs, especially as agents are malleable enough to execute varied tasks depending on the request.

During Google Cloud Next 2024, CEO Thomas Kurian said, “Many customers and partners are building increasingly sophisticated AI agents.” With AI agents becoming all the rage, improving their capabilities has become a priority, which explains the slew of updates to agent-building capabilities in the last year alone.

What Can They Improve On?

These updates point to exciting possibilities for what AI agents could do in the future. AI agents are widely seen as a step towards AGI. Whether that arrives in the near future or is years away, agents still represent some of the best of current AI innovation. 

In its recent announcement, AWS specified that an agent’s code interpretation capabilities are used only when the LLM deems them necessary. Though this limits how these capabilities are invoked, since the choice is not up to the user, it also marks a form of semi-autonomy.

However, fully autonomous AI agents are still a long way off. “I think it’s going to require not just one but two orders of magnitude more computation to train the models,” said Microsoft AI chief Mustafa Suleyman.

Nevertheless, the enterprise focus on seamless operations and better customer experiences means that agent capabilities will continue to expand, potentially allowing them to act and execute tasks autonomously to produce relevant and digestible outputs for the company’s employees.

As Sivasubramanian has said of AWS, “We’re energised by the progress our customers have already made in making generative AI a reality for their organisations and will continue to innovate on their behalf.”

This seems to be the sentiment across the board for both GenAI and cloud providers, as many industry stalwarts, including Andrew Ng, Andrej Karpathy and Vinod Khosla, have voiced a need for more education around and funding in agent research.

]]>
https://analyticsindiamag.com/ai-trends/aws-google-and-other-cloud-giants-go-after-ai-agents/feed/ 0