
Mixing layers and a 1-character error

I recently spent a great deal of time resolving an issue that was made much more difficult because the software's architecture mixed different layers of responsibility.

The Error

[ActionsApi] [Get] entity conformance check failed: document of type class org.apache.openwhisk.core.entity.WhiskTrigger did not match expected type class org.apache.openwhisk.core.entity.WhiskActionMetaData. o.a.c.c.Controller$

Huh?  What?

This error came from a Kubernetes installation of OpenWhisk. Due to a configuration error, this installation was not running serverless actions correctly. Because this was a new environment, it was not possible to simply revert my changes back to the previous version - it would not run in that configuration either - so I had to put a lot of effort into determining what had gone wrong.

The database

Serverless functions are stored in a CouchDB instance.  Because this was a new install, one avenue to explore was the possibility that the action was actually stored incorrectly.  In other words, perhaps the installation steps ran incorrectly and stored the action in the database in a way that generated this error.

After figuring out the mapping by looking at the access logs for CouchDB, I manually got the action out of the database with
curl [database host]:5984/test_whisks/whisk.system%2Fsamples%2FhelloWorld

Bummer, the JSON that was returned had the action right there: "entityType":"action".

Perhaps there was also a trigger with the exact same name:
curl [database host]:5984/test_whisks/_all_docs

Nope, no triggers with the reported name, so for now, I conclude it's not the database.

Find the Code

In ApiUtils, I found the part of the code generating the error:
  /**
   * Waits on specified Future that returns an entity of type A from datastore. Terminates HTTP request.
   *
   * @param entity future that returns an entity of type A fetched from datastore
   * @param postProcess an optional continuation to post process the result of the
   * get and terminate the HTTP request directly
   *
   * Responses are one of (Code, Message)
   * - 200 entity A as JSON
   * - 404 Not Found
   * - 500 Internal Server Error
   */
  protected def getEntity[A <: DocumentRevisionProvider, Au >: A](entity: Future[A],
                                                                  postProcess: Option[PostProcessEntity[A]] = None)(
    implicit transid: TransactionId,
    format: RootJsonFormat[A],
    ma: Manifest[A]) = {
    onComplete(entity) {
      case Success(entity) =>
        logging.debug(this, s"[GET] entity success")
        postProcess map { _(entity) } getOrElse complete(OK, entity)
      case Failure(t: NoDocumentException) =>
        logging.debug(this, s"[GET] entity does not exist")
        terminate(NotFound)
      case Failure(t: DocumentTypeMismatchException) =>
        logging.debug(this, s"[GET] entity conformance check failed: ${t.getMessage}")
        terminate(Conflict, conformanceMessage)
      case Failure(t: ArtifactStoreException) =>
        logging.debug(this, s"[GET] entity unreadable")
        terminate(InternalServerError, t.getMessage)
      case Failure(t: Throwable) =>
        logging.error(this, s"[GET] entity failed: ${t.getMessage}")
        terminate(InternalServerError)
    }
  }
But because it's a function, Scala inlines the call, so there is no ApiUtils.class file to attach a debugger to. I ended up adding additional debug code to the function, but the reasons why, and the complications of the environment, are lengthy enough to deserve their own post.

Spray

With the additional logging in place, I got a new error from the JSON deserialization library, spray:
spray.json.DeserializationException: requirement failed: memory 0 MB below allowed threshould of 512 B.

Again, huh?

The spray library allows clients to inject parsing rules, and in this case OpenWhisk injects a memory size check to ensure that actions don't use too much memory.
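To illustrate the pattern (this is a minimal sketch, not the actual spray-json or OpenWhisk code - the class and function names here are hypothetical), here is roughly how a domain rule injected into deserialization turns a business-rule failure into a parse error, and how wrapping without a cause hides the original check:

```scala
// Stand-in for spray.json.DeserializationException, so this sketch is self-contained.
case class DeserializationException(msg: String) extends RuntimeException(msg)

case class MemoryLimit(bytes: Long)

// A hypothetical "injected rule": the parser itself enforces a minimum size.
def readMemoryLimit(raw: Long): MemoryLimit =
  try {
    // Business logic living inside the parser: reject sizes below 512 B.
    require(raw >= 512, s"memory ${raw / (1024 * 1024)} MB below allowed threshold of 512 B")
    MemoryLimit(raw)
  } catch {
    case e: IllegalArgumentException =>
      // The rule failure is re-thrown as a generic deserialization error,
      // and the original exception is NOT passed along as the cause.
      throw DeserializationException(e.getMessage)
  }
```

With a value of 256, `raw / (1024 * 1024)` truncates to 0, which is exactly the shape of the "memory 0 MB" message above.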

OK, a quick search of the code and configuration files for a stray 0. Nope.

Line by line

Now that I knew I was looking for a size check, it was back to all of my environment changes to see which one I had messed up.
Checking each and every configuration file change for the new environment found the change that caused the error. The sequence leading up to this error:
  1. The new environment I was deploying into had different min and max memory sizes for Kubernetes pods.  OpenWhisk tries to create a pod to run an invocation, and that pod has to meet the namespace quota limits.  So I double-clicked on the 128 and changed it to 256.  But the editor had also selected the "m", so 128m became 256, and because I intended to change that line, I missed that the m had been dropped.
  2. 256 bytes, when rounded down to whole megabytes, is 0 - this is where the zero came from.
  3. Spray swallowed that exception without reporting it and converted it to a deserialization error.
  4. OpenWhisk caught that exception and then tried to deserialize the action into other types, and that succeeded for a trigger, which does not have a min/max memory setting.
  5. OpenWhisk then complained that the type was incorrect.
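The arithmetic behind steps 1 and 2 can be sketched in a few lines (a hypothetical parser, assuming the suffix semantics described above: "m" means mebibytes, no suffix means bytes):

```scala
// Hypothetical size parsing: "128m" is 128 mebibytes, a bare "256" is 256 bytes.
def toBytes(s: String): Long = s.toLowerCase match {
  case v if v.endsWith("m") => v.dropRight(1).toLong * 1024L * 1024L
  case v                    => v.toLong // no suffix: plain bytes
}

// Integer division truncates, so anything under a full mebibyte becomes 0 MB.
def toWholeMB(bytes: Long): Long = bytes / (1024L * 1024L)
```

With these rules, 128m is 128 MB, but 256 with the dropped suffix becomes 256 bytes, which truncates to 0 MB.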

Boundaries Broken

Lessons we can learn from this:
  • If you are going to mix business logic (in this case, actions should have limited sizes) into a parser, you need to verify that you can at least get that error reported.
  • If you are going to re-throw an exception with a new type, you should pass along the cause so that the logging layer has a chance to report the nested error.
  • If some parts of a code base allow for parsing errors and retry logic while you have strict type checking elsewhere, you need to do some flow analysis to make sure you surface the right error.
  • Finally, as always, it's usually the change you made and not the system.  True, in this case the reporting of the error was very difficult to track down, but it was still my own typo that started the whole chain reaction.
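The second lesson can be shown concretely (a generic Scala sketch, not OpenWhisk code):

```scala
// Wrapping an exception while preserving the original as the cause,
// so the full chain survives into the logs.
def parseStrict(raw: String): Int =
  try raw.trim.toInt
  catch {
    case e: NumberFormatException =>
      // `throw new RuntimeException(msg)` alone would discard `e` entirely;
      // passing it as the second argument keeps the root failure reportable.
      throw new RuntimeException(s"could not parse '$raw'", e)
  }
```

A logging layer that walks `getCause` can then report the nested NumberFormatException instead of only the generic wrapper.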
