Podcast Transcript

The Web3 Creator’s Guide to Blockchain Data with Erik Reppel


Mint Season 5 episode 15 welcomes Erik Reppel, Head of Data at Zora, who joins Mint to give us an intro to all things blockchain data: from why NFT project founders and creators should analyze their collectors’ activity to how to apply these insights to grow their community.

In this episode, we discuss: 

  • 00:09 – Intro
  • 13:10 – What Does Ownership Mean in Web3?
  • 26:27 – Designing Custom Experiences for Collectors using Data
  • 27:40 – The Downside of Blockchain Transparency
  • 34:44 – Web3 Has a Metadata Problem
  • 41:17 – Where Does Artificial Intelligence and Machine Learning Play a Role in Web3?
  • 47:13 – Outro

I hope you enjoy our conversation.

Support season 5’s NFT sponsors

1. CyberConnect

2. Coinvise

3. Mint Songs


Erik, welcome to Mint, my friend. Good to meet you. Thank you for being on. How’s it going?

Erik Reppel: Good man. Happy Memorial Day.


Yes, sir. Happy Memorial Day. So, what turned into a bidding frenzy for a little Noun has now turned into a podcast episode. I’m stoked to have you on. Prior to the bid, I actually didn’t know who you were. But then someone, I forget who, posted a thread about who was competing for the Noun. I was doing some more research as to who was who, came across your profile, and found it really interesting. Zora has been represented on the podcast before, and for a minute now we’ve been talking about the data side of things, and you’re no stranger to it. So I’ll shut up, because it’s all about you. But before we dive in: who are you, Erik? What does the world need to know about you? And how did you get your start in crypto?

Erik Reppel: Yeah. So, I first got into crypto in like 2010. I was on the Bitcoin forums way back in the day. I don’t know if you remember StumbleUpon, it was like a browser extension. I loved StumbleUpon when I was in high school. I randomly hit the Bitcoin forums one day, and the Bitcoin white paper, in like 2010, 2011, and immediately was like, wow, that’s super interesting. I was like 16 at the time, so it wasn’t a technical thing, but I thought the idea was really fascinating. And, you know, there’s this interesting mystery of, like, who is this Satoshi guy. Me and a buddy of mine ended up buying a tiny, tiny amount, like 100 bucks’ worth of Bitcoin. The coin was like, I don’t know, $90 at the time. And we were day trading in the back of our high school chemistry class on a little laptop. That was my first foray into crypto. Then I kind of forgot about it for a couple of years and went and did a degree in computer science with a specialization in ML. A couple of years into the degree, I realized I could probably understand how this stuff works technically now, and went back through it. This was the early Ethereum days, like 2015, 2016. And I’ve been in it ever since. I was at Coinbase for about four years. I was the first ML platform engineer slash ML engineer at Coinbase and led that team for a while. Then I actually left the crypto realm: I went to Clubhouse, I was there for just under three months, and ended up just missing crypto too much. And now I’m here at Zora as the Head of Data.

Nice. So early in your career, prior to college, were you already building software and coding as an amateur? Or was your first stab at it during school?

Erik Reppel: So, I didn’t really start coding in earnest until I was about 20. My dad is an engineer, though, and I always thought, I don’t want to do what my dad does. He was a software engineer, of course. So I actually started my degree as a mechanical engineer. I’d coded a few things on the side, but one summer after my first year, I had this idea where I really wanted to build an IoT device for tracking the dynamic path of a barbell while you’re lifting. I was a competitive powerlifter at the time, and I needed code to do it. So I kind of taught myself how to code, and I was like, oh, actually, this is pretty sick. It’s way higher leverage than mechanical engineering. It’s a way quicker feedback cycle. It’s much more addicting than being in CAD all day. And yeah, I switched my degree and never went back.

Wow. A bodybuilder! I can tell your shoulders are pretty wide. Do you still do that consistently? Weightlifting?

Erik Reppel: Powerlifter, not a bodybuilder.

Okay, that shows you, along with my skinny-ass arms, how uneducated I am about it. Wait, teach me about that for a sec. What does that entail? What’s the difference, really?

Erik Reppel: Bodybuilding is like, how jacked can you look, and powerlifting is like, how strong can you be. So competitive powerlifting is really just: what’s your one-rep maximum for a bench press, a squat, and a deadlift, and then you add those together and you get a total. At one point I was the provincial champion in my weight class for the junior division in powerlifting, which sounds really impressive, except that when I won, I came first out of two people.

Came first. That’s all that matters. Excuse me. What would you say the similarities are between powerlifting and, I guess, programming and ML, essentially what you do professionally full time? Are there any similarities?

Erik Reppel: Yeah, I think so. People who get really good at powerlifting end up having to be very data driven. This sounds like a cop-out answer, but you actually have to get very data driven in how you train, because especially as a natural powerlifter, like I was, you hit your natural potential pretty quickly. Then a lot of it comes down to how good your training regimen is. A lot of things like volume and accommodation are very biomechanical and technique driven. And then also, can you design a program that fits your physiology correctly to maximize your potential? A lot of that is really analytical, breaking down sets, reps, approach. It’s like a whole field.

Yeah. You know, about your time at Clubhouse: Clubhouse has a unique place in crypto history, specifically in the early, early NFT days. 3LAU, Justin Blau, the DJ and founder and CEO of Royal (I don’t know if he’s the co-founder of Royal), basically leveraged Clubhouse and, I guess, audio-based marketing as a way to position himself as an icon, with this glorious drop that he did in a live room. And I just remember Clubhouse as this point in NFT and crypto history, this remarkable platform that helped spearhead a lot of education in this space. Were you at Clubhouse during that time? What was your experience like over there?

Erik Reppel: I kind of entered Clubhouse near the end of that time, as it was on the second peak of the cycle. I had a great time at Clubhouse. I really like all the people there, and I have a bunch of friends who work there. I think the thing people don’t realize is that Clubhouse is insanely popular everywhere except North America. They are huge. If you ever want a trip, go search out Persian Clubhouse. It’s huge. Last week, there was a room with like 20,000 people in it on Persian Clubhouse. And the same is true of India and Brazil and a bunch of other places. I think once you lose the US, like, Silicon Valley narrative, people just think that you’re gone. Internationally, people forget there’s like 8 billion people on the planet. And in a lot of it, especially India, there’s a culture where you’re kind of always on the phone with your family and friends. So Clubhouse is the natural product for certain markets that have that kind of casual, conversational demo. The other thing I think people don’t realize about Clubhouse is that they have genuinely one of the strongest engineering teams that I’ve seen. They were super under water during that first Elon peak; I think they had like five employees or something. But they’ve assembled a really, really top group of talent over there. So, they’re amazing. I just missed crypto too much.

Once you get into the hole, it’s kind of hard to get out. It’s vast. I remember when I first got in, in 2017. The second I discovered what you could do with the technology and learned more about the white paper, whatever, I was like, this is what I want to commit myself to. I don’t know for how long, but it’s the only thing I can see. And that brings you, after Clubhouse, to Zora itself. It’s also the reason why I reached out: I wanted to do an episode around data with someone who’s been in the weeds of it for quite a bit of time. So, on your Twitter bio, on your LinkedIn, it says you’re Head of Data. Okay, but what does that mean, for those who don’t know?

Erik Reppel: Yeah, so I think Head of Data means something most people wouldn’t guess. Most people hear data and think data science. What I actually run is the team that does everything between the blockchain and up to and including the APIs. So, we do indexing, we do search, we’re planning on doing recommendations, we do API design: everything that takes what is on chain and makes it available for people to use through applications and those APIs. We’re actually launching the newest version of our API on Wednesday, two days from now, as we’re filming. Those APIs are used by zora.co, our marketplace, and by other people who build both major projects and more kind of indie things on top of it. There are data science aspects and ML aspects that we’re planning on getting to and investing more in over the coming months. But really, the thing I’ve spent my last five months on is building a really powerful indexer, so that we can take all the data that’s on chain, curate it, and make it available in a way that I don’t think is currently possible.

Got it. So, for those who aren’t familiar with Zora, at least from the consumer-facing side, I think a lot of creators understand Zora, or maybe interact with Zora, as a marketplace for buying, selling, and listing NFTs. But on the back end, you guys have this entire vehicle that a bunch of marketplaces have built their foundation on: initially Catalog, sound.xyz if I’m not mistaken, and many, many others. So, when you talk about these APIs, I just want to add a clarification for those who don’t really understand: there’s an entire engine behind Zora. From what I understand, the consumer-facing marketplace is just there to encapsulate the vision of what it can be. The real product is what’s happening underneath the hood. Did I get that right?

Erik Reppel: Yeah, that’s totally right. I think a lot of people don’t get that, and you explained it really well. The way I think about Zora is in layers. The lowest layer of what Zora is, is the protocol, which is the on-chain NFT marketplace. So that’s a lot of words. What does that mean? Take OpenSea, for example. The settlement of NFTs and the transfers happen on chain using the Wyvern protocol, which is kind of what Zora’s protocol competes with, in quotes. I think there are a lot of differences between Zora’s marketplaces and Wyvern, but let’s set that aside. The order book for those markets is actually stored in OpenSea’s database. So, you can’t actually see what people want to buy or sell on chain with something like OpenSea right now. With Zora, all of the data, like who’s bid on what, who’s asked what, what auctions are available, all of that lives on chain. So you don’t need access to OpenSea’s API in order to build these interfaces. That’s the lowest layer. Then what my team works on is making all that on-chain data available in a way that’s not crazy. It is hard to read the blockchain, still, and I think it’s kind of a niche skill. But people know how to use REST APIs and how to use GraphQL. We try to make that data more palatable and easier to use, and also index every NFT, like, find all the NFTs and make that data available, as well as market data. Then, if we go one level up the stack, we have a community tools team that tries to make those APIs as easy to use and as low code as possible. My team focuses on how we can expose the most data; community tools says, hey, let’s take that complexity and roll it up into opinionated, nicer tools that are a little higher level.
And then zora.co, increasingly, is this roll-up of all three layers of the stack that shows you what you can build using all the tools that we have in the open. It’s this vertical approach versus the horizontal approach that some other places are taking.

What Does Ownership Mean in Web3?

A lot of creators understand, or at least try to understand, crypto and web3 from the NFT point of view. A lot of the narrative around web3, and why it’s so powerful, is that it’s the ownership layer of the internet. And coming from a data background, I think you would understand this thesis better than anybody else, because you’re actually playing with the data that people own. You’re indexing and creating APIs with what people own on chain. So, from your point of view, what does ownership really mean on chain? I guess we can just start there.

Erik Reppel: This is a really good question. So, what is ownership, right? The only thing that you actually own in an NFT contract is effectively a row in a table in the contract. When you have a collection, you as a person, as an address or a wallet, can look up what your pieces within that collection or contract are, and each piece has what’s called a token URI. So you own whatever is at that token URI. But that token URI oftentimes links off chain, to some centralized server or an S3 bucket or something like that. So, technically, the thing that you own is just the thing that’s on chain. I’m glossing over things like on-chain NFTs and on-chain metadata. But there’s a ton of stuff where, if someone who isn’t you decides to stop paying their AWS bill, it’ll just disappear. I think a lot of the JPEG-era NFTs aren’t actually decentralized. They’re either surfaced through other people’s APIs or they’re just in an S3 bucket or wherever they may be. Whether you really own that JPEG in an S3 bucket that you don’t pay the bill for is almost a philosophical question.
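The "row in a table" picture can be made concrete with a toy model. The following is only a sketch under stated assumptions: `ToyNFTContract` and `storage_kind` are hypothetical names invented for illustration and are not Zora's code or any real contract. The methods `owner_of` and `token_uri` loosely mirror the `ownerOf` and `tokenURI` functions of the ERC-721 standard, and the URIs are made up.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNFTContract:
    """In-memory stand-in for an NFT contract (hypothetical, illustration only)."""
    owners: dict[int, str] = field(default_factory=dict)      # token id -> owner address
    token_uris: dict[int, str] = field(default_factory=dict)  # token id -> metadata location

    def mint(self, token_id: int, owner: str, uri: str) -> None:
        if token_id in self.owners:
            raise ValueError(f"token {token_id} already exists")
        self.owners[token_id] = owner
        self.token_uris[token_id] = uri

    def owner_of(self, token_id: int) -> str:
        return self.owners[token_id]

    def token_uri(self, token_id: int) -> str:
        return self.token_uris[token_id]

    def tokens_of(self, owner: str) -> list[int]:
        """All token ids whose 'row' currently names this owner."""
        return sorted(t for t, o in self.owners.items() if o == owner)


def storage_kind(uri: str) -> str:
    """Rough guess at where the thing behind a token URI actually lives."""
    if uri.startswith("data:"):
        return "on-chain"            # metadata is embedded in the URI itself
    if uri.startswith("ipfs://"):
        return "content-addressed"   # anyone can keep the content pinned
    return "centralized-or-unknown"  # e.g. someone else's server or S3 bucket


contract = ToyNFTContract()
contract.mint(1, "0xalice", "https://example-bucket.s3.amazonaws.com/1.json")
contract.mint(2, "0xalice", "ipfs://example-cid/2.json")

for token_id in contract.tokens_of("0xalice"):
    print(token_id, storage_kind(contract.token_uri(token_id)))
```

In this model the only thing the contract records is the two mappings; everything the URI points to lives wherever the URI says it does.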

And so I think eventually, over time, especially in a bear market, we’re gonna see more projects lean into on-chain metadata, on-chain SVGs, and things like IPFS for hosting as an alternative to S3. Let’s take an example. If you have a project, and the artist mints, and all their assets are in an S3 bucket, and they, you know, get hit by a bus, and their estate doesn’t cover their AWS bill, that bucket is getting deleted along with your content. Maybe you made a copy of it, but the thing that’s on chain links to that bucket. So you may have the copy, but you have no way of proving, like, I swear this is the JPEG that was linked in that bucket. But if it’s IPFS, you know, the InterPlanetary File System, a decentralized storage layer, then even if the artist vanishes, you can keep paying for the pinning of that content, and the link will still resolve, so to say. From an IP perspective, this is also an interesting rabbit hole. I’m not an IP lawyer, but I think that IP within NFTs is super underexplored right now.
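The difference between a bucket URL and a content-addressed link can be sketched in a few lines. This is a deliberate simplification: real IPFS content identifiers are built from a multihash over chunked data rather than a plain SHA-256 of the file, and the function names `content_link` and `copy_matches` are hypothetical. The point it illustrates is only that a link derived from the bytes lets anyone check a copy, while a location-based URL gives a copy nothing to be checked against.

```python
import hashlib


def content_link(data: bytes) -> str:
    """Derive a link from the bytes themselves (simplified stand-in for an IPFS CID)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def copy_matches(link: str, copy: bytes) -> bool:
    """A content-derived link can be verified against any copy, wherever it is hosted."""
    return content_link(copy) == link


artwork = b"original artwork bytes"      # placeholder for the real file contents
on_chain_link = content_link(artwork)    # what a content-addressed token URI commits to

print(copy_matches(on_chain_link, artwork))               # True
print(copy_matches(on_chain_link, artwork + b" edited"))  # False
```

With a link like `https://some-bucket/1.png` there is no equivalent check: once the bucket is gone, a saved copy cannot be tied back to what the token pointed at.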

So that’s also another huge discussion. Take the music NFT side, for example: when you buy a music NFT, what do you actually own? You have projects like Royal, like Decent, the list goes on and on, that are trying to tokenize or fractionalize IP rights, so that along with the collectible you buy, you also get whatever, I guess, real-world value is accrued with it. And even then, people still don’t really understand what they’re buying. I was at fault for that too. When I initially started buying things, I was like, okay, if I buy the NFT, then the song is in my wallet, so I own the song now, right? But the funny thing is, the real world and the online world are not always connected from a legal and IP point of view. So, having your definition of what ownership is really helps with the next piece of this conversation, because part of owning your data is that you should be able to do cool things with it. And I feel like that element of doing cool things with the data you own is really underexplored. Creators have a band of collectors online; they have anonymous wallet addresses. But underneath that, they have layers and layers of information on their individual collectors that they can tap into. I’ll shut up, because I’m about to go on a rampage with my love for this type of conversation. I’d love for you to explain this, so take it away.

Erik Reppel: I want to hear your thoughts on this too. I feel like you had some good thoughts here.

Yeah, go ahead.

Erik Reppel: I think the fundamental thing is, a lot of the state is actually hard to get on chain, and it’s hard to get, period, from APIs and stuff. So I think the barrier to entry for a lot of artists is higher than it probably should be. That’s where we’re trying to build some of these products and things that make that easier. The other thing is, a lot of these things are only licensed for noncommercial use, and you can really make an argument that making a derivative mint is commercial use. This is why I think things like Nouns and other CC0 projects, like Creative Commons projects, are really compelling. As an artist or creator, depending on where on the decentralization spectrum you want to fall, I think there’s a more or less correct model for you. If you’re BTS, right, you have a billion fans and every piece of your IP is worth so much. You probably just want to do an NBA Top Shot equivalent, a trading card kind of thing. It’s commemorative. If you’re trying to bootstrap a community, I would argue Nouns has proven you want to go full CC0: no one owns it, everyone owns it, anybody can do anything with the art commercially, non-commercially, personally, whatever, make derivatives, everything, because you want to propagate a meme as far as it’ll go. And I think there’s a middle ground in there where, if you’re an independent artist, you may want to use something like Royal. I think Royal makes a lot of sense. As a fan, you’re almost, I don’t want to call it speculating, but you’re investing in the artist: if you think they’re an artist who’ll blow up and increase their streaming revenue, your cut will appreciate over time. It’s very close to an investment asset, which I’m sure the Royal legal team wouldn’t appreciate me saying. But I think people don’t think enough about where they want to fit on that spectrum. And on the technology side, right now it’s not easy enough to take advantage of that composability, of how you can remix different platforms and different concepts, because a lot of the state isn’t open.

Yeah. So back on that subject: BTS knows their Spotify data to an extent, and they know their social media web2 data to an extent. But they still don’t own that data, right? And by not owning that data, you risk deplatformization, if that’s even a word. You see this happen all the time: people build an audience on TikTok, and then for whatever reason they do something in a gray area that gets them removed and banned forever. And there goes their following, their audience, and everything. And, you know, you asked about my point of view. So, I issue free NFTs for my listeners to collect every single season. I’ve done it since season two. These are free to mint, or at least almost free to mint. And for season four, I made them nontransferable; I made them soulbound. Because in the future, as the podcast grows, I envision doing something really valuable with being able to prove who my contributors and participants were. But my dilemma as a podcaster, as a creator, is this: through Buzzsprout, I can see the split of men and women from Spotify data and Apple Music data; I can see the type of traffic I get on my website, and Google is able to decipher what’s male, what’s female, the trends of audiences, where they came from, heat maps, etc. Yet I know nothing about who’s collected my free NFTs online, when in reality all the information is there.

So, for example, I want to know how many of them are holding Bored Apes. I want to know how many of them also hold, I don’t know, FWB. What POAPs have they collected from ETHDenver, from ETHAmsterdam, etc.? I can maybe tell if they’re male or female based on the assets they hold. For example, are they in Boys Club, which is a female-only club? Then I can tell how many of my collectors may be female. If I start seeing all these trends and touch points, I can create better-tailored content for them and bring in better sponsors. If I know that, let’s say, 70% have at least one ETH in their wallet, then I can create interesting, curated experiences for those early supporters from there on out. That’s my mentality and my understanding of this as someone who doesn’t code, who doesn’t know how to query or use SQL for analytics, for example. I don’t know how to produce that custom data for myself. But that’s my understanding from a high-level point of view. Am I going in the right direction here? What am I missing?

Erik Reppel: Yeah, you were losing me for a second there, when you were like, I want all this demographic data. But you brought it back around exactly. So, I think the real tradeoff of on-chain data is that it’s fully open, but it’s higher intent. The reason Google can kind of tell gender and all these things is that they associate many different signals. You don’t enter your gender on Google, typically, right? I assume that if I were Google, I would have a model that predicts who you are based on your activity, gender being one potential dimension of who you are. And all the other things, like how targetable you are, how potentially valuable you are as a recurring user, are things that you can model based on aggregated activity. Whereas on chain, you don’t have clicks. You only have the highest-intent things, because someone has spent money or gas to do that thing. So in a way your users are way more valuable, because they’re higher conviction about the things that they’ve done. But you have way less aggregate data. And so, to your point, I think right now there’s one deficiency: the tooling is really bad. This is why we ended up building a full API instead of just relying on stuff like The Graph or other APIs: we think we have a take that’ll make some of those things a lot easier. Over time, we can start to expose that power in no-code and low-code solutions. But you have to start with access to and availability of that aggregate data. You have to have everything before you can make it available. So take your query of how many of my users hold Bored Apes.

That’s a thing that you can totally do, not to talk my own book, from our API that we’re launching on Wednesday. And I think the ability to easily get things like that is going to make a lot of experiences a lot more robust. Knowing what’s in your wallet is relatively easy, but there’s a whole bunch of extra steps, like the ones you described, that you can do if you just have more data. And yeah, I think this centralization-decentralization tradeoff is really underplayed, especially for creators who are minting. Even on chain, you can know from within a contract whether someone owns, like, an Ape. So, if you’re a developer, you can write a contract where every mint by a wallet that’s holding this type of thing gets some special treatment at mint time. I think things like that, where you actually compose ownership and tailor the experience based not only on what’s in people’s wallets but on aggregates, for both viewing websites and on-chain experiences, are very, very underexplored.
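The "how many of my collectors hold Bored Apes" question reduces to a set intersection once the holdings data is available. The sketch below assumes that data has already been fetched from some indexer and flattened into a wallet-to-collections mapping; it does not call Zora's API or any real service, and the function names, wallet addresses, and collection labels are all made up for illustration.

```python
from collections import Counter


def normalize(address: str) -> str:
    """Ethereum addresses are case-insensitive hex, so compare them lowercased."""
    return address.strip().lower()


def collectors_holding(collectors: list[str],
                       holdings: dict[str, list[str]],
                       collection: str) -> set[str]:
    """Return the collectors whose wallet also holds `collection`."""
    by_wallet = {normalize(w): set(c) for w, c in holdings.items()}
    return {
        normalize(w) for w in collectors
        if collection in by_wallet.get(normalize(w), set())
    }


def top_other_collections(collectors: list[str],
                          holdings: dict[str, list[str]],
                          n: int = 3) -> list[tuple[str, int]]:
    """Count, per collection, how many collectors hold at least one piece."""
    by_wallet = {normalize(w): set(c) for w, c in holdings.items()}
    counts: Counter[str] = Counter()
    for wallet in collectors:
        counts.update(by_wallet.get(normalize(wallet), set()))
    return counts.most_common(n)


# Made-up sample data standing in for an indexer response.
my_collectors = ["0xA11CE", "0xB0B", "0xCAFE"]
holdings = {
    "0xa11ce": ["BAYC", "FWB"],
    "0xb0b": ["FWB"],
    "0xcafe": [],
}

apes = collectors_holding(my_collectors, holdings, "BAYC")
print(f"{len(apes)} of {len(my_collectors)} collectors hold BAYC")  # 1 of 3 collectors hold BAYC
print(top_other_collections(my_collectors, holdings))               # [('FWB', 2), ('BAYC', 1)]
```

The hard part described in the conversation is the step this sketch skips: getting complete, indexed holdings data in the first place.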

Designing Custom Experiences for Collectors using Data

So how would a creator go about doing that right now? Let’s assume a creator who doesn’t know how to code. How can they go about creating these custom experiences that you and I were just talking about, for their collectors, using that data? What’s the best way to do that right now?

Erik Reppel: I don’t think there’s a good way of doing it right now. A lot of the tools that exist currently are very limited, and they’re very much like, let’s publish your catalog of JPEGs. To be honest with you, I don’t know if you can make a low-code tool with as much power as this, unless it’s a very, very specialized low-code tool. And I think those hyper-specialized low-code tools tend to have a lower TAM than more general tooling. That’s the market reason why you see things like OpenSea’s lazy mint, but you don’t see an OpenSea complex mint. There’s no equivalent. I think, theoretically, you could make a more generalized tool that hybridizes both. I think this is an underexplored niche in the market, and someone will probably fill it at some point.

The Downside of Blockchain Transparency

Yeah. So, I also want to talk about the pros and cons of data being inherently public. Does that sway us toward more of a dystopian future of some sort? I only bring this up, Erik, because in web2, platforms are inherently really powerful because of the data that they collect on their users, and also because they shelter that data. Spotify collects a ton of data on their artists, streaming data, etc., but they only provide so much to the artists themselves. It’s a big complaint in the industry. While there are dashboards, from what I understand they’re very limited. In web3, everything is inherently public, and you’re relying on entrepreneurs to build the tools that contextualize the blockchain. That comes with its pros and cons. We’ve talked a lot about the pros and the benefits that come with a transparent database, so I’m curious about the cons from your point of view. What are the cons of making data inherently public? What are your thoughts around that?

Erik Reppel: I think the biggest con is also the pro: you can see everything. And if you can see everything, then if you’ve doxed your wallet, people know exactly who you are and exactly what you’ve done. I think there’s a lot more sophistication around doxing wallets than people anticipate. Services like Chainalysis and Coinbase Analytics actually work really well if you’re high intent. You assume Google has all your data, and you assume that if you’re an admin within a Google system, you can probably read some of that data. But everyone is effectively a read admin on the blockchain, which is maybe okay, because the data is more consented. Every time you interact with the blockchain, you know that the thing you’re doing is going to be on chain and associated with your wallet. I’m sure a lot of people, yourself included, myself included, have doxed wallets and then semi-doxed or fully undoxed wallets. You can opt into your level of privacy and bring whatever identity you want to whatever place you want to go. So that’s quite nice. But if someone manages to dox your wallet, they can see everything that you’ve done, so there is that level of concern. Probably the worst part is, oh man, you did some thing that’s cringe in retrospect down the road. But we’ll see, right? It’s early days. And, you know, that was a dark place, if you...

Yeah, well, I feel like the ultimate recipe is connecting on-chain data to off-chain data, for example associating social profiles with anonymous addresses. That’s when it gets kind of scary. And honestly, I don’t think we’re far away from it. I don’t know of anybody doing that, but you’d have better context on this than I would. I feel like that’s the red line kind of thing.

Erik Reppel: Yeah, all these things are a spectrum, right? If you’ve done a poor job of obfuscating your wallet, you can probably be doxed by a very motivated actor. There are some ways of spinning up a wallet and funding it that are pretty hard to trace, especially for smaller amounts of currency. You know, Tornado Cash and stuff. But ultimately, I think no one really cares. Unless you’ve broken the law or done something sketchy, transferring from an exchange to a new wallet you spin up is probably enough security for 99.9% of people. You know, going from the Coinbase hot wallet to a new MetaMask, it’s like, oh well, Coinbase hot wallet, that could be any one of 90 million users. So it doesn’t mean anything, and that’s enough anonymity to be comfortable. If the FBI wants you or something, they might subpoena Coinbase or something crazy. But ultimately, if you haven’t broken the law, I think that’s enough. There’s value in being public too. That’s how we got to this conversation, right? I think having kind of gradients of anonymous... anonymity, is that the word?

I think so.

Erik Reppel: Yeah. Anonymity. That’s the word.

Oh, yeah. Well, I’ve been saying it wrong all along.

Erik Reppel: I think it’s good, but the key is being able to opt into the level that you want, and each experience being able to accept multiple logins, right? That’s the value of being able to bring your wallet anywhere.

Web3 Has a Metadata Problem

I remember, Season Four was all about the music industry. And I remember having the Catalog guys on, the Mint Songs guys on, I had Blau on, a bunch of people. And a big point of, I guess, conversation was metadata in web three, and how metadata is very much a mess. And this vision of creating a music NFT application, where you can organize all these different songs in a very seamless, intuitive way, is actually quite difficult, because the songs that are being published in the form of NFTs are not organized and categorized correctly, so indexers can’t kind of eat all the data and spit it out in a very, I guess, user friendly manner, from what I understand. And I guess, what’s your perspective in terms of how data is kind of comprehended and written, at least in the form of smart contracts or web three in general? And I know I’m butchering that question, because I don’t have the best insight as to what these terminologies are, and hence why you’re here. But I think you know where I’m coming from.

Erik Reppel: Yeah, totally.

And help me rephrase that question, too. Yeah.

Erik Reppel: So, I think, let’s distinguish between on chain and off chain. So okay, metadata is typically off chain, but the, sorry, are you familiar with, like, EIPs?

Remind me for those who don’t know.

Erik Reppel: So, EIPs are Ethereum Improvement Proposals, and they get turned into ERCs. And so, whenever you see like ERC 721, which is like the NFT standard, or ERC 1155, ERC 20, these are improvement proposals that get turned into the things that people actually do. And those become the standards that everyone builds on top of. So, there are EIPs around pretty much everything that’s on chain, and so on chain it is relatively consistent. It’s a dark forest and it’s really not as good as you would hope it would be; a lot of people don’t implement ERC 721 correctly, this is what I learned building an indexer. But at least there are standards that people are supposed to adhere to. Metadata, there’s really no standard. There’s kind of like the OpenSea standard, like the way that OpenSea publishes metadata, and there’s a few others out there. But metadata is a mess because of the lack of standardization: anyone who mints an NFT can basically make up their own metadata standard. And then it’s like, how do you access that data? Like I said, every NFT is actually just a reference to a token URI. And so, that token URI points to a server that you need to grab a JSON file from, and that file contains the metadata that links to an image and an audio track. And that’s like the happy path, right? The NFT could also have on chain metadata, which is something I personally am a fan of, where the metadata is actually encoded into the token URI. Which sounds ominous, but it means that it’s actually on chain instead of being off chain and you having to rely on a server. If you remember the example I gave of the server going down, that applies to metadata too. And then there’s the, like, what’s the format? What’s the schema? Does it have attributes? If it’s a song, does it have album? Does it have content? What do you do if it’s a GIF, all these things?
And there’s some standardization, but nowhere near enough. And that’s why it’s hard to build indexers, because it’s easy to index on chain data, because of the standards. It’s really hard to index metadata well, because there are fewer standards, and it can really be anything. And if you try to build an indexer, an artist is going to be like, why doesn’t my NFT show up? You’re gonna be like, well, you made up the standard, how would we be able to parse your metadata? And because NFTs very rarely link directly to the media, they link to metadata that links to the media, it’s hard to get a render, which is, I think, an underappreciated aspect. I can give you the on chain rant too, but.
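
To make the on chain metadata case concrete, here is a minimal Python sketch of decoding a token URI that carries its JSON directly as a base64 `data:` URI. The helper name `decode_onchain_token_uri`, the example fields, and the placeholder image link are assumptions for illustration only; real contracts vary in how they encode this, which is exactly the standardization problem described above.

```python
import base64
import json

# Prefix commonly used when JSON metadata is embedded in the token URI itself.
DATA_URI_PREFIX = "data:application/json;base64,"


def decode_onchain_token_uri(token_uri: str) -> dict:
    """Hypothetical helper: decode metadata embedded in a base64 data URI."""
    if not token_uri.startswith(DATA_URI_PREFIX):
        # Anything else (https://, ipfs://, ...) means the JSON lives off chain
        # and has to be fetched from wherever the URI points.
        raise ValueError("not a base64 JSON data URI")
    payload = token_uri[len(DATA_URI_PREFIX):]
    return json.loads(base64.b64decode(payload))


# Build an example URI the way an on chain contract might return it.
# The field names and the image link are made up for this sketch.
example_metadata = {
    "name": "Example #1",
    "image": "ipfs://example-image",
    "attributes": [],
}
example_uri = DATA_URI_PREFIX + base64.b64encode(
    json.dumps(example_metadata).encode()
).decode()

decoded = decode_onchain_token_uri(example_uri)
print(decoded["name"])
```

An indexer would still have to guess which keys the decoded JSON uses, since nothing forces `name`, `image`, or `attributes` to be present.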

Yeah, go for it. Go for it. Yeah, that is useful. And the on-chain rant, what does that entail?

Erik Reppel: So, I think that people are bad at events overall. So, do you kind of understand what like a contract event is?

Assume I do, but for those who don’t, explain what an event is. Yeah.

Erik Reppel: So, when you’re writing a smart contract, there’s two ways to access data from a smart contract. There’s functions that you can call on the blockchain that return data to you. These are called, like, read functions. And then there’s events that these contracts emit as they do things. And so, here’s an example, right? If you want the current owner of a token, you can just call ownerOf on an ERC 721 contract. That involves calling an RPC node: you give it a token ID and it’ll return the address to you. And that’s really easy. You have to have an RPC node, but, you know, Alchemy and all these services are really good. It’s not that hard. Events are different: as the developer writes the contract, they write events, which are emitted by the contract. This is code that they write that then emits data from the contract. And so, for example, when you transfer a token, there’s what’s called a Transfer event, which says this token was transferred from this address to this address. And the Transfer event is kind of a tricky one, because the way that events work in Ethereum, the Transfer event for an NFT looks exactly like the Transfer event for an ERC 20, like a fungible token. And so, if you’ve ever seen a wallet that says you have this balance of an NFT, with an integer next to it, that’s the token ID, because they think that your token was a fungible token instead of a non-fungible token.
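
The Transfer ambiguity can be sketched in a few lines of Python. ERC 20 and ERC 721 share the event signature `Transfer(address,address,uint256)`, so both produce the same first topic; the usual way to tell them apart is that ERC 721 marks the token ID as indexed (four topics), while ERC 20 puts the amount in the data field (three topics). The log dictionaries and helper names below are made up for illustration, and the topic count is a heuristic that non-standard contracts can break.

```python
# keccak256("Transfer(address,address,uint256)"), shared by ERC 20 and ERC 721.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def pad_topic(value: int) -> str:
    """Format an integer as a 32-byte hex topic (illustrative helper)."""
    return "0x" + format(value, "064x")


def classify_transfer(log: dict) -> str:
    """Guess the token standard of a Transfer log from its topic count."""
    topics = log["topics"]
    if not topics or topics[0].lower() != TRANSFER_TOPIC:
        return "not a Transfer event"
    if len(topics) == 4:
        return "ERC-721 style"  # from, to, and token ID are all indexed
    if len(topics) == 3:
        return "ERC-20 style"   # amount lives in the data field instead
    return "unknown Transfer layout"


# Two hand-built logs with placeholder addresses 0xa and 0xb.
nft_log = {
    "topics": [TRANSFER_TOPIC, pad_topic(0xA), pad_topic(0xB), pad_topic(42)],
    "data": "0x",
}
erc20_log = {
    "topics": [TRANSFER_TOPIC, pad_topic(0xA), pad_topic(0xB)],
    "data": pad_topic(1000),
}

print(classify_transfer(nft_log), int(nft_log["topics"][3], 16))
print(classify_transfer(erc20_log), int(erc20_log["data"], 16))
```

A wallet that skips this check and reads every Transfer as ERC 20 would show the integer it finds as a balance, which is the token ID mix-up described above.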

Got it.

Erik Reppel: But because ERC 721 is very well specced, and events are part of the spec, NFT events are pretty good. Markets don’t have specs, like markets don’t have a de facto ERC. And so, what events come out of markets is all over the place. And so, if you’re trying to understand how much someone paid and in what currency and everything, it can get really, really gnarly. And so, my kind of takeaway point to all of this, and I think we’re probably gonna write a blog post about this at some point: right now, everyone optimizes contracts for gas, but optimizing your contracts for programmatic readability is also really important, both in terms of what read methods you expose on your contract and what events you emit.

Got it. So as an industry, how do you imagine everybody kind of getting more on the same page with a lot of these standards with organizing everything? What’s kind of required to get there?

Erik Reppel: Time. I think it’s just an immature industry, right? People talk about things like HTTP. If you think about it, every ERC, or many ERCs, these are kind of de facto protocols for different things, right? ERC 721 is a protocol for non-fungible ownership. It takes time to develop these things well, and there are plenty of optimizations that can be made. And I think it takes time for people to get on the same page as to how to structure things, because everyone has an opinion. And over time, I think convenience makes those opinions converge. And so, I think one day it would be great to have EIPs around, like, okay, do you have an NFT image? Does that image have attributes? This is how your metadata should be structured. The tricky thing is that those aren’t actually Ethereum improvements. So, they shouldn’t really be Ethereum Improvement Proposals. They should almost be like web three consortium type things, where it’s a metadata encoding standard for any chain, right? Because every chain, not just Ethereum, runs into this problem. Just because you’re an NFT on, doesn’t mean that you have the metadata problem, doesn’t mean that you don’t have the metadata problem. That was the sentence.

Where Does Artificial Intelligence and Machine Learning Play a Role in Web3?

Got it. Got it. So, another point of expertise, which is why I wanted to have you on, again, is your machine learning background. AI and ML is something that I know very little about. But I feel like when the three worlds kind of collide, ML, AI and web three, we could have some pretty interesting stuff happen. Specifically on the data side, I’d love to get some context and help me understand this better. All this data that you’re indexing, where does AI, where does ML kind of come into play?

Erik Reppel: For sure. So, I think for NFTs right now, there’s a bunch of financialization applications that you can use ML for, you know, projection, prediction, etc. I think the most compelling use cases near term are search and recommendation. So, search kind of falls into two steps. It’s retrieval: does this thing match the thing I’m looking for? And then it’s ranking. So, an example of this is, let’s say I search for ape, right? There’s a million tokens now that have ape in the name or in the metadata, or are somehow associating themselves with the word ape, because Bored Apes are so popular. How to rank those things is where ML comes in. So usually, you can use machine learning to understand, based off of features that you derive about the token, what’s relevant. For example, a really good feature might be the age of the token, how long it’s been on the blockchain. Another one, number of transfers, might be a really good feature. You can derive rankings based off of what’s probably relevant to a user when you’re showing them a list of the results for the search term ape.
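
The two-step split can be sketched as a toy Python example: a retrieval pass that keeps tokens matching the query, then a ranking pass that orders them by a score built from the features mentioned above (token age and number of transfers). The `Token` fields, the sample tokens, and the hand-picked weights are all assumptions; in a real system the weights would be learned by a model rather than typed in.

```python
from dataclasses import dataclass


@dataclass
class Token:
    name: str
    age_days: int        # how long the token has been on chain
    transfer_count: int  # how many times it has changed hands


def retrieve(tokens: list[Token], query: str) -> list[Token]:
    """Retrieval: keep every token whose name mentions the query."""
    q = query.lower()
    return [t for t in tokens if q in t.name.lower()]


def score(token: Token) -> float:
    """Ranking score; these weights are a stand-in for a trained model."""
    return 0.01 * token.age_days + 0.1 * token.transfer_count


def search(tokens: list[Token], query: str) -> list[Token]:
    """Retrieve matches, then rank them from most to least relevant."""
    return sorted(retrieve(tokens, query), key=score, reverse=True)


tokens = [
    Token("Copycat Ape", age_days=3, transfer_count=1),
    Token("Old Ape", age_days=400, transfer_count=30),
    Token("Unrelated Cat", age_days=500, transfer_count=50),
]

results = search(tokens, "ape")
print([t.name for t in results])
```

Retrieval alone would return both ape tokens in arbitrary order; the ranking step is what pushes the long-lived, frequently traded one above the recent copycat.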

The second thing is recommendation, where for a user, what else is relevant to that user? So, for example, if you’re a person who collects rare NFTs, like everything in your wallet is, you know, top five percentile rarity NFTs from various collections, you should be recommended tokens from collections that are rare or have rare attributes, right? And that’s the thing that you can actually fully understand based off of metadata and derive that, hey, this is an interesting feature, this is an interesting item for a user. If we’re trying to show them something from, you know, what’s a random collection? If we’re trying to show them a Crypto Coven, we should show them a Crypto Coven with a rare attribute rather than a common attribute, because they’re willing to pay a premium for that piece, right? So, I think those are the two first and biggest things that aren’t quite cracked yet, and we’re taking a crack at that. Our search is currently really good at retrieval. I think the recommendation needs a little bit of, or sorry, the ranking needs a little bit of work. And then we’re planning on doing some stuff around recommendation in the future.
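
Deriving rarity from metadata can be sketched like this in Python: count how often each attribute value appears across a collection, then order tokens by their rarest attribute. The tiny four-token collection and the trait names are invented for the example, and taking the single rarest trait is just one simple scoring choice among many.

```python
from collections import Counter

# Made-up collection: token ID -> attributes pulled from its metadata.
collection = {
    1: {"hat": "crown", "background": "blue"},
    2: {"hat": "cap", "background": "blue"},
    3: {"hat": "cap", "background": "blue"},
    4: {"hat": "cap", "background": "red"},
}


def trait_frequencies(tokens: dict) -> dict:
    """Share of the collection that carries each (trait, value) pair."""
    counts = Counter()
    for attrs in tokens.values():
        for item in attrs.items():
            counts[item] += 1
    total = len(tokens)
    return {item: n / total for item, n in counts.items()}


def rarest_trait_share(attrs: dict, freqs: dict) -> float:
    """Lower means rarer: the share of the token's least common trait."""
    return min(freqs[item] for item in attrs.items())


freqs = trait_frequencies(collection)

# Token IDs ordered from rarest to most common.
ranked = sorted(
    collection, key=lambda tid: rarest_trait_share(collection[tid], freqs)
)
print(ranked)
```

For a collector whose wallet skews rare, a recommender could surface tokens from the front of `ranked` first.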

Got it. Got it. I’m trying to think about what other questions to kind of pick your brain about I mean, what else are you seeing, I guess on the data landscape that you think creators should be more aware of, should be more alert to whether it be tools, whether it be ways as to kind of grow a community using data? What are your thoughts around that?

Erik Reppel: I think growing your community using data is really powerful. Finding who your target audience is, is really important, and I think it’s harder than it sounds. It’s not like it is with ad targeting, where you can just say this is the exact type of person I want to identify and pitch to. But I think that there’s a lot of value in creators trying to become a little bit more technical. Like, for example, learning SQL is not that hard. And something like Dune will let you answer a lot of questions that you may have about who owns your stuff. And I think in an afternoon, if you just take a SQL course, you’ll be able to query for how many of your token holders also hold a Bored Ape. And I think a lot of people see the technical side and get really intimidated. And it’s actually, I think, more approachable than people would generally expect. But the problem of being early days, and why being early is so alpha, is that it requires more depth and technical knowledge. And so, if you want to be on the bleeding edge, creators are fairly technical. Blau is actually very technical. So is RAC. Latasha is very technical. A lot of these artists, not all of them are writing code all the time. Some of them are. But they all understand how these systems work very fundamentally, at a level that I think is deeper than your average consumer. And I think, especially during a bear market, the thing that people should really be doing is trying to innovate with the form, and stretch themselves to do not just community building but try new ways of doing community.
For example, you know, you have a Mint season pass, you could easily have a Mint membership card that gets you early access to things and gates your community, but right now that requires some technical knowledge that isn’t super easy to have. But hopefully, we’re gonna get there. Hopefully, the tooling gets better. Hopefully, these things become no code or low code. And people can kind of compose these experiences that use, like, hey, who is the person right now? Oh, they’re X person? Well, then we should show this, or then we should let them into this new mint, or we should do whatever it is. But those are hard to do right now, just because they’re inherently technical. Yeah.
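
The holder-overlap question mentioned above (how many of your token holders also hold a Bored Ape) comes down to a single SQL join. The sketch below runs it against an in-memory SQLite database from Python so it is self-contained; the `holdings` table, its columns, and the sample wallets are assumptions for illustration, and a platform like Dune exposes its own, differently named tables.

```python
import sqlite3

# In-memory database with a made-up "who holds what" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE holdings (owner TEXT, collection TEXT, token_id INTEGER)"
)
conn.executemany(
    "INSERT INTO holdings VALUES (?, ?, ?)",
    [
        ("0xalice", "my_collection", 1),
        ("0xalice", "bored_ape", 77),
        ("0xbob", "my_collection", 2),
        ("0xcarol", "bored_ape", 5),
        ("0xdave", "my_collection", 3),
        ("0xdave", "my_collection", 4),
        ("0xdave", "bored_ape", 9),
    ],
)

# Join the table to itself on owner: one side is your collection,
# the other side is the collection you want to compare against.
QUERY = """
SELECT COUNT(DISTINCT mine.owner)
FROM holdings AS mine
JOIN holdings AS apes ON apes.owner = mine.owner
WHERE mine.collection = 'my_collection'
  AND apes.collection = 'bored_ape'
"""

overlap = conn.execute(QUERY).fetchone()[0]
print(overlap)
```

`COUNT(DISTINCT ...)` matters here: a wallet holding several of your tokens should still count as one holder.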


Well, listen, I learned a lot throughout this conversation. I think before I let you go, tell me more about what’s coming out on Wednesday with Zora and this new API that you guys are pushing out. What can we expect from there?

Erik Reppel: So, for the developers in the audience, we have a new GraphQL API, and it’s got a lot of really cool features. Using our API, you can effectively see, for a token, every sale that was settled on chain, including how much they paid to mint it. And so, you can actually see, for an individual token, here’s the start value, here’s the first sale, here’s the second sale, yada, yada. You can do all the regular, like, here’s what’s in a wallet, here’s what’s in a collection type thing. But I think we’ve managed to do some really powerful stuff around transcoding media and metadata standardization that should help a lot of people build these kinds of experiences. And then I think we’ve done a lot of interesting data level stuff for analytics and being able to understand price over time and history of NFTs. So, I’m excited for it, I think it’s gonna be really cool. I was telling someone, I think someone inevitably is gonna build a tax tool on top of it. But that’s kind of cool, that I think we can both be the underpinnings for an artist trying to do a drop and also be a tax person’s, you know, fantasy scenario.

Interesting. Yeah, I guess a whole, I’d even argue that a lot of the tax applications are incredibly early and need a ton of improvement. So, I’m excited to see what you guys roll out. And I guess, Erik, before I let you go, where can we find you specifically, and kind of learn more about what you’re doing as well?

Erik Reppel: Yeah, definitely. The best place is Twitter. I’m @programmer on Twitter.

It’s a good, that’s a good handle.

Erik Reppel: I wish it was an NFT.

Love it, Erik. We got to do this again soon. Thanks again.

Erik Reppel: Yeah, it’s great chatting, Adam. Thank you.