Incredible milestone for Upstart #proudseedinvestor

Today Upstart announced that it had received the FIRST “no action letter” from the CFPB.  Details are in Dave’s post here.  This is an incredible moment for consumer lending as the future of the industry hinges on the responsible use of AI/ML to reduce costs and increase availability of credit for the benefit of consumers.  Upstart is paving the way, showing consumers that there is a much better and more cost-effective way to borrow and lend.  Truly exceptional moment for a great company, fantastic people and world-class founding team.  #proudseedinvestor

Posted in Big Data, Entrepreneurship, Innovation, Start-Ups, Uncategorized


Reposted from Dataconomy

Rich Miner – Android co-founder, formerly Google Ventures, now Google — asked me recently “What if companies managed their data like they manage their money?”

It’s a basic but profound question that merits some thoughts based on my 25 years managing both information and financial functions in technology and data businesses.

The surface-level analogy here, data is to money, is too apparent to linger on for long. In the information age, data has massive tangible value — especially now that businesses are applying machine learning to data in analytics and applications to accelerate cost savings and revenue growth.

The question gets more nuanced and interesting in unpacking the analogy to compare how data and money are handled as strategic assets within a business. By definition, businesses manage money, along with employees, as their most strategic asset: not just as cash flow (savings and revenue) on the P&L, but as growth-inducing capital on the balance sheet (M&A). And the tools for managing financial assets have naturally followed. Fortunately for them, Chief Financial Officers (CFOs) have a host of well-defined mechanisms to manage their company’s financial assets: frameworks, systems and tools that help them understand where the money comes from, where it goes and how much they currently have on hand. To this they’re adding ever more advanced predictive financial analytics as a way of guiding their decisions.

Chief Information Officers (CIOs), along with the newly created executive positions of Chief Data Officer (CDO) and Chief Analytics Officer (CAO), are not that fortunate. For many good historical reasons, they lack a clear understanding of where information comes from, where it goes and what their current data inventory looks like. While their CFO counterparts can lean on common financial management frameworks dating back to the year 1340 and the advent of general ledger accounting, CIOs, CDOs and CAOs lack these sorts of common frameworks for managing data as a true asset. As a result, they find it difficult to answer some of the most basic questions: What information attributes do you have, how many records are there in those attributes, and who is using those attributes and records in their analyses?

This challenge, however, goes deeper than frameworks, systems and even tools, down to a fundamental conceptual flaw that the Fortune 1000 is still paying the price for. For decades, large businesses managed their data as “exhaust” (a byproduct of their systems, applications and transactions) to be contained, rather than as fuel for growth. The best-led companies (the ones with the highest Corporate IQ) have made the mental switch. Companies like GE, which, to quote a recent Forbes piece by Randy Bean and Tom Davenport, “is leveraging AI and machine learning fueled by the power of Big Data” to accomplish its “digital transformation.” Or like Capital One, which has always used data as a strategic asset to drive its growth. Or younger companies like Upstart, which views itself less as a loan processing business and more as a data operation that uses the asset to make better loans.

These companies, and many more enlightened ones like them, see the transformational power of data and analytics, use it as a strategic weapon and — importantly — commit not only to the technology required to transform, but the behavioral and organizational change needed to operationalize it.

One more common thread now links these companies: those most successful at digital transformation have CDOs, CAOs and CIOs behaving aggressively like CFOs. Bill Ruh, GE’s Chief Digital Officer and CEO of GE Digital, is a great example. Beyond leading the company’s drive on the Internet of Things, he has overseen GE’s integration of data within its business operations. This includes its collaboration with Tamr to integrate GE’s global supplier data, which in the Forbes article Bill refers to as:

“… a big win. It’s easy for suppliers to charge different prices for the same product when you can’t compare them across business units. We might spend $250 million a year on nuts and bolts, but that only becomes salient when you look across business units and see if they’re coming from the same suppliers. If they are, you are in a much better position to negotiate.”

Bill’s quote may very well be the answer to Rich Miner’s question: What if companies managed their data as carefully as they manage their money? They would, like GE did with Tamr on the supplier integration side:

  • Optimize Sourcing Strategies for ~$50 billion in material spend across business units
  • Renegotiate Contract Terms to identify $80 million in savings
  • Reduce Total Landed Cost of Products by unifying/cleaning tens of millions of shipping records
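To make Bill's point about comparing prices across business units concrete, here is a rough sketch of the kind of check that becomes possible once supplier data is unified. Everything in it is assumed for illustration: the record layout, the unit and supplier names, and the prices are hypothetical, and this is not how GE or Tamr actually implemented it.

```python
from collections import defaultdict

# Hypothetical, already-unified purchase records:
# (business_unit, supplier, part, unit_price_usd)
purchases = [
    ("Unit A", "Supplier X", "M8 hex bolt", 0.42),
    ("Unit B", "Supplier X", "M8 hex bolt", 0.55),
    ("Unit C", "Supplier X", "M8 hex bolt", 0.47),
    ("Unit A", "Supplier Y", "M6 lock nut", 0.10),
    ("Unit B", "Supplier Y", "M6 lock nut", 0.10),
]


def price_spreads(records):
    """Return (supplier, part) pairs that are priced differently across units."""
    prices = defaultdict(dict)
    for unit, supplier, part, price in records:
        # Simplification: keep one price per unit (the last one seen).
        prices[(supplier, part)][unit] = price

    spreads = {}
    for key, by_unit in prices.items():
        low, high = min(by_unit.values()), max(by_unit.values())
        if high > low:
            spreads[key] = {"low": low, "high": high, "by_unit": by_unit}
    return spreads


spreads = price_spreads(purchases)
for (supplier, part), info in spreads.items():
    print(f"{supplier} / {part}: {info['low']:.2f} to {info['high']:.2f} "
          f"across {len(info['by_unit'])} units")
```

The grouping itself is trivial. The hard part in practice is getting to the point where "Supplier X" and "M8 hex bolt" mean the same thing in every business unit's records, which is the data unification problem.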

In other words, they would manage data as the most strategic of assets — committing as much C-level attention, as much analytical firepower and as much ROI-based measurement to data as they do to money.

Posted in Uncategorized

Podcast on founding of Recorded Future with Christopher Ahlberg

Christopher Ahlberg and I spent some time reflecting on the founding of Recorded Future.  Podcast is here.

If you haven’t heard of Recorded Future – you soon will.  RF is truly one of the driving forces for good in cyber security.

A few excerpts from my comments in the podcast are below:

“I think one of the interesting and compelling things early in the formation of the company was this idea that there was a need for a temporal index on the web. Disambiguating time on the web would be a very, very powerful thing and when we first started out, we weren’t really certain what applications would be most powerful and compelling. For the first couple of years we looked at many different applications in financial services, and security, and intelligence.

Then there was one of these classic “crossing the chasm” moments where Christopher called us all up and said, “I really think that cyber is our thing. Cyber threat intelligence is going to be the most powerful and compelling application of this temporal index on the web.” And boy, was he right.”

“There were times when we were walking down paths that didn’t pan out and the amazing thing about the early team at Recorded Future was there was sort of a commitment to fail fast. No fear, actually, kind of a hunger almost to try lots of new applications very, very aggressively and to not be discouraged if those applications didn’t turn out to be compelling or interesting, either on the technical side or on the business side.

The whole team was very, very deliberate and aggressive in looking, seeking, for this application where these temporal indexes were going to have a huge impact analytically.”

“I felt when I came on board and when Rich came on and Simon in the early days, it was clearly and deliberately Christopher building a team around him that was no different than any other team. One of the things that Recorded Future and Christopher does uniquely is, he actually puts the board to work all the time. It’s very demanding being a board member at Recorded Future because Christopher has high expectations. I actually think this is the next generation of board membership is going to be more like what Christopher does at Recorded Future where, we’re all responsible for actually delivering value to the company and being very actively involved.

It’s not about just showing up to meetings and checking a bunch of boxes. It’s been that way from the beginning and that culture still persists on the board.”

Posted in Uncategorized

AWS Glue, ETL, and the Persistent Challenge of Data Variety

Yesterday Amazon announced the public availability of AWS Glue, which it describes as a fully managed ETL service that aims to streamline the challenges of data preparation. The service was previewed back in December 2016 at Amazon’s re:Invent conference, so while it’s not a surprise to anyone watching the space, the general release of AWS Glue is an important milestone.

The ETL market isn’t going away, but it’s about to get a lot more interesting.  AWS Glue is a big deal, and will be a disruptive force in the traditional ETL market – think Talend, Informatica, IBM, Oracle.  While those committed to hybrid and/or multi-cloud architectures will probably view Glue with some trepidation [insert joke about Glue making AWS more sticky here], it will surely create a lot of value for those who are already committed to AWS, and will also attract new users drawn to single-vendor, full-stack AWS solutions. It will put serious pressure on traditional ETL vendors who are fighting for relevance in the cloud. And it will also force competing cloud and PaaS providers to move in a similar direction to try to match Amazon’s new hosted multi-tenant ETL offering.

Amid all this swirl, the general availability of Glue presents an opportunity to reflect on the limitations of the traditional ETL paradigm (cloud-based or otherwise), particularly when it comes to solving today’s biggest unmet enterprise data challenge.  The deterministic nature of traditional data management approaches embodied by ETL and MDM tools fails to solve the core issue of data variety, the third and most problematic ‘V’ in the classic view of big data, which has historically emphasized volume and velocity. If you’re a shiny new ‘born on the Web’ company, data variety probably isn’t your biggest problem. If, however, you’re a mature, large-scale enterprise that has been around for decades or even centuries, you’ve almost certainly endured waves of technology adoption, acquisitions and divestitures, shadow IT groups, and successive re-orgs, all of which have exacerbated the problem of data variety: multiple data silos, differing schemas and formats, and wildly variable completeness and quality.

Data variety is the biggest obstacle stopping enterprises from realizing analytic and operational breakthroughs, and traditional ETL and MDM tools and their deterministic approaches haven’t helped these companies overcome the challenge of their data silos. Cloud-based ETL won’t solve the problem either; it simply relocates the issue. While it may be economically more attractive and scalable, it shares the fundamental flaws of traditional ETL:

  • Rules break: The logic of ETL systems is based on rules determined by developers and enshrined in code. Deterministic, static rules don’t scale well as data variety increases. A better approach is to use machine learning techniques to create bottom-up, probabilistic models for combining and cleaning data. Not only is this more scalable as new sources get added, but it is also more adaptable to entropy in the underlying data sources themselves.
  • Context is king: Human expertise about data’s business meaning is essential. ETL rule developers often lack the domain knowledge necessary to interpret the data that they are trying to integrate. As a result, they are either forced to make assumptions which can lead to data quality problems when they guess wrong, or they can go off to interview subject matter experts and attempt to codify their findings which is a time-consuming process subject to all the vagaries of human communication. A more effective approach would be to create a data integration system that easily captures and integrates SME knowledge to enhance the results generated by machine learning.
  • Best-of-Breed is Best: DataOps, the practice of applying the same principles behind DevOps to the challenge of increasing analytic velocity, requires a mix of best-of-breed tools that work well together rather than the data management platforms offered by the big, monolithic traditional vendors (Oracle, IBM, Microsoft, et al.).
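As a toy illustration of the first two points, here is a sketch that contrasts a deterministic matching rule with a score-based approach that routes uncertain cases to a subject matter expert. To be clear about the assumptions: the string-similarity score from Python's standard `difflib` is only a crude stand-in for a learned match probability, the thresholds and company names are made up, and none of this represents Tamr's actual method.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop commas and periods, and collapse whitespace."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())


def rule_match(a, b):
    """Deterministic rule: match only if the normalized names are identical."""
    return normalize(a) == normalize(b)


def match_score(a, b):
    """Similarity in [0, 1]; a stand-in for a learned match probability."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def classify(a, b, accept=0.9, review=0.6):
    """Accept confident matches, send the uncertain middle band to an expert."""
    score = match_score(a, b)
    if score >= accept:
        return "match"
    if score >= review:
        return "ask_expert"
    return "no_match"


pairs = [
    ("Acme Corp.", "ACME corp"),
    ("Acme Corp.", "Acme Corporation"),
    ("Acme Corp.", "Zenith Labs"),
]
for a, b in pairs:
    print(f"{a!r} vs {b!r}: rule={rule_match(a, b)}, decision={classify(a, b)}")
```

The exact-match rule treats "Acme Corp." and "Acme Corporation" as unrelated, and patching that means writing yet another rule. The score-based version flags the pair as uncertain instead, which is where expert feedback would come in; in a real system those expert answers would feed back into the model rather than into a hand-tuned threshold.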

ETL isn’t going away anytime soon, and AWS Glue is going to make the market a whole lot more dynamic.  It will precipitate a series of moves and countermoves by incumbents and new entrants alike.  We also think it will shine a brighter light on the enterprise-scale data variety problems that ETL approaches are ill-equipped to tackle.  That’s the challenge that we’re focused on at Tamr, and we welcome the opportunity to partner with like-minded enterprises and vendors who see probabilistic data unification as both a complement to existing ETL and the path to unlocking new value from their data assets.

My Tamr co-founder and Turing Award winner Dr. Michael Stonebraker has written about new approaches to scalable data unification here if you’re interested in diving more deeply into the topic.

Posted in Uncategorized

Koa Labs’ Support for NEVCA’s Anti-Discrimination & Sexual Harassment Statement

Discrimination and sexual harassment are intolerable in any environment, and they are kryptonite for the diversity that unleashes the innovations that empower startups to change the world.  I was appalled by the allegations in The Information about the repeated episodes of sexual harassment by a male venture capitalist toward female entrepreneurs, and I was encouraged by Reid Hoffman’s quick reaction. Having funded many women-led companies at Koa, I know how uniquely difficult it can be for women entrepreneurs in tech.  Being an entrepreneur is difficult enough, and being a female entrepreneur is harder still–way harder than it should be.  We need to do everything possible to make sure our ecosystem in Cambridge/Boston reinforces the values that enable women entrepreneurs to be wildly successful.

I’m proud to announce my, Tamr’s, and Koa’s unequivocal support for the New England Venture Capital Association’s Statement on Discrimination and Sexual Harassment.  Diversity is a core value at Koa, and we’re working hard to practice what we preach.  I’ve written recently about the importance of diversity here.

I strongly encourage everyone in the New England start-up ecosystem (and beyond) to read NEVCA’s statement, to support their efforts, and to do everything in their power to make their companies inclusive and supportive of diversity on all dimensions.  Let’s be the change we want to see in the world.

Posted in Uncategorized

Best SaaS tools for Startups Revisited

Fantastic post here from Ariel Diaz @ Blissfully.  I did a similar post here a few years ago.  Figured I’d do a quick update on the tools that I like/use at the moment – amazing how quickly the landscape changes.

My updated recommendations:

  • Basic Productivity : Google Apps
  • Accounting : Quickbooks Online
  • Benefits/Payroll/HR  : Namely (previous suggestion was Gusto – changed for a number of reasons – but Namely is the best imho – and can do all benefits/payroll/HR in one system)
  • Expenses : Expensify
  • Recruiting : Lever
  • Equity Management/409A: eShares – I’ve blogged about this before here and Fred Wilson also posted about eShares here
  • AR/AP Automation : (thanks to Dan Meyer for reminder)
  • CRM : SFDC (I’m tempted by Hubspot CRM and might change)
  • Source code management : Github
  • Issue management : Jira
  • Product management :
  • Design : Sketch + InVision
  • Hosting : GCP – I know a bunch of people will
  • Identity/IAM : JumpCloud or Okta – tradeoffs abound – think of JumpCloud as basic LDAP service vs. Okta as broader IAM solution.

Every small company has its own needs but it’s amazing how easily and quickly you can have the basics set up for very reasonable cost.  My biggest request to the SaaS vendors is always the same – PLEASE implement Google OAuth to simplify login (thank you Expensify and Lever).

It’s also amazing how many big companies, if they moved to systems like the above, could RADICALLY reduce their IT costs/spend and increase their scale.

I’m encouraging folks on my team (biz folks anyway) to buy and use Chromebooks – I have been using a Pixel for 4+ years and it’s great (saw Jeff Dean running a Pixel at a meeting once – good enough for Jeff – good enough for me 🙂).  Dual boot Linux instructions here.

@Tamr we have a BYO device policy (+2 year refresh) that works well – but increasingly I’m seeing that biz people @ startups are starting to give up their beloved Macs because if you use all of the above tools, you really just need a browser.  ChromeOS is so much easier to maintain (my friend Christopher Ahlberg is giving me the eyeroll) and so much more secure OOTB.

Posted in Uncategorized

Tamr & GE

I’m truly thrilled by the work that we’ve been doing @ Tamr with GE. Summary of the latest and greatest @ Tamr blog here. The winds of change are blowing strong in the enterprise data and analytics space. I’m very excited about the potential value to be unlocked as large companies figure out how to manage their data as an asset using DataOps instead of merely treating their data as exhaust.

Posted in Analytics, Big Data, Enterprise Software, Entrepreneurship, Founders, Health Care, Information Technology, Innovation, Start-Ups, Uncategorized, Venture Funding