Patent Data Quality

Is clean data a pipe dream?

The United States Patent and Trademark Office (USPTO) recently announced an expansion of PatentsView, its visualization tool for US patents. First launched a few years ago, the tool was intended to make 40 years of patent filing data available for free to those interested in examining "the dynamics of inventor patenting activity over time." Although it is limited to patents (not applications) and covers only the US, it offers some interesting visualizations around locations and citations.

In a blog post last month, USPTO director Michelle Lee said the PatentsView tool is based on "the highest-quality patent data available," connecting 40 years' worth of information about inventors, their organizations, and their locations in unprecedented ways. The newly revamped interface presents three user-friendly starting points - relationship, locations, and comparison visualizations - which allow for deeper exploration and detailed views. However, through no fault of the USPTO, the dataset is rife with spelling errors, doesn't reflect patent reassignments, and doesn't resolve company subsidiaries or acquisitions.

This issue is not unique to the USPTO. Other patent offices around the world face similar barriers to presenting "clean" data. The first issue, spelling errors, reflects the fact that assignee information (like other fields, such as inventor names) is entered manually and is therefore prone to error and inconsistency. For example, "International Business Machines" has been spelled 1,200 different ways as a patent assignee over the last two decades in the USPTO data set.
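To make the spelling problem concrete, here is a minimal sketch of one common way to collapse assignee-name variants. It is not Innography's method or anything the USPTO provides. The suffix list, the function names, and the 0.9 similarity cutoff are all assumptions chosen for illustration. The first step strips case, punctuation, and legal-form words; the second uses fuzzy matching from Python's standard difflib module to catch outright misspellings.

```python
import difflib
import re

# Hypothetical, deliberately short list of legal-form words to strip.
# A production list would be much longer and jurisdiction-aware.
LEGAL_SUFFIXES = {"corp", "corporation", "inc", "incorporated",
                  "co", "company", "ltd", "llc"}


def normalize_assignee(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop trailing legal-form words."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def closest_canonical(name, canonical_names, cutoff=0.9):
    """Return the canonical name most similar to `name`, or None.

    `cutoff` is an assumed threshold: too low merges unrelated companies,
    too high misses genuine misspellings.
    """
    matches = difflib.get_close_matches(
        normalize_assignee(name), canonical_names, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


if __name__ == "__main__":
    canonical = ["international business machines", "microsoft"]
    for raw in ["International Business Machines Corp.",
                "INTERNATIONAL BUSINESS MACHINES CORPORATION",
                "Internatonal Business Machines, Inc"]:
        print(raw, "->", closest_canonical(raw, canonical))
```

Simple normalization handles differences in case, punctuation, and suffixes. The fuzzy step is what catches a dropped letter, and it is also where false merges creep in, which is why such rules typically need human review.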

In addition, PTO data doesn't get corrected or updated based on later corrections or patent reassignments. For example, patent US8176440 was originally - and incorrectly - assigned to Silicon Labs. My company, Innography, filed a certificate of correction to update the assignment, yet the USPTO data and PatentsView still don't reflect this. In fact, Innography research shows that nearly 20 percent of US patents are reassigned in their lifetimes, translating into a significant number of company portfolio errors based on this factor alone.

Finally, PTO data also doesn't reflect when companies purchase each other, when there's a spinoff, or when a subsidiary files patents. Microsoft, for example, now owns all of LinkedIn's patents, even if the reassignments haven't been processed.
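The reassignment and acquisition problems can be sketched the same way. The example below is illustrative only: the patent IDs, dates, and the "Example Startup" company are made up, and the data structures are assumptions rather than any real PTO or vendor schema. Only the LinkedIn-to-Microsoft relationship comes from the example above. The idea is to take the recorded assignee, apply the most recent reassignment if one exists, and then walk up a corporate-ownership map to the ultimate parent.

```python
def ultimate_parent(company, parent_of):
    """Follow subsidiary -> parent links until a company with no parent is reached."""
    seen = set()
    while company in parent_of and company not in seen:
        seen.add(company)  # guard against accidental cycles in the map
        company = parent_of[company]
    return company


def current_owner(patent_id, original_assignee, reassignments, parent_of):
    """Resolve who effectively controls a patent today.

    original_assignee: patent_id -> assignee as recorded at grant
    reassignments:     patent_id -> list of (ISO date, new owner) events
    parent_of:         company -> acquiring or parent company
    """
    owner = original_assignee[patent_id]
    events = sorted(reassignments.get(patent_id, []))  # ISO dates sort chronologically
    if events:
        owner = events[-1][1]  # the latest recorded reassignment wins
    return ultimate_parent(owner, parent_of)


if __name__ == "__main__":
    # All records below are hypothetical placeholders.
    original_assignee = {"EXAMPLE-001": "LinkedIn", "EXAMPLE-002": "Example Startup"}
    reassignments = {"EXAMPLE-002": [("2015-03-01", "LinkedIn")]}
    parent_of = {"LinkedIn": "Microsoft"}

    for pid in original_assignee:
        print(pid, "->", current_owner(pid, original_assignee, reassignments, parent_of))
```

In this toy data both patents roll up to Microsoft, even though neither record names Microsoft as the assignee. The hard part in practice is keeping the ownership map itself accurate.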

As a result, PTO data falls far short of reflecting reality, where patents and companies are bought and sold every day, and where data-entry errors exist and are corrected. The accuracy of the data is very low when it comes to representing company patent portfolios in the real world.

The Cost of Free Data
The USPTO aims to increase the transparency of patenting and invention processes. But if the quality of data and search results is questionable, what good is it to IP practitioners?

The patenting process yields rich information that supports economic research, prior-art searching, and the discovery of broader trends in filing patterns. However, that data was never intended to be used as-is to inform strategic business decisions such as in- and out-licensing, merger and acquisition activities, or portfolio pruning and maintenance decisions.

It makes sense for PTOs to offer their data for free as a way to engage the community's interest in patenting processes. However, too many lightweight patent analytics tools use this flawed data verbatim to tout their "data quality" to IP professionals.

Many patent analyses, such as competitive benchmarking, acquisition analysis, and negotiation preparation, start with a company's patent portfolio. In addition, just about every board-level question about patents requires accurate patent ownership information: "Are we ahead of or behind this competitor?" "What companies should we be worried about in this technology area?"

Poor data quality makes it difficult, if not impossible, to answer those questions accurately. To create the most accurate data set possible, companies must use other sources of information to cross-check and improve patent data accuracy.

Innography data scientists process more than 2,000 company acquisitions annually, and our user base suggests another 5,000 updates each year. As a result, Innography has created more than 10 million data-correction rules over the last decade, which are continuously updated via machine learning and crowdsourcing.

Company leaders must be able to use patent reports to assess market opportunities and make strategic business decisions. This requires an IP analytics solution that reflects real-world changes and doesn't rely on poor-quality, outdated PTO assignee information.

More Stories By Tyron Stading

Tyron Stading is president and founder of Innography, and chief data officer for CPA Global. He has been named one of the "World's Leading IP Strategists" by IAM, and one of National Law Journal's "50 Intellectual Property Trailblazers & Pioneers." Before Innography, Tyron was an IBM worldwide industry solutions manager in the telecommunications and utilities sector, and worked at several start-ups focused on mobile communications and network security. He has published multiple research papers and filed more than three dozen patents. Tyron has a BS in Computer Science from Stanford University and an MS in Technology Commercialization from The University of Texas.
