Building a knowledge graph in Logseq

This is a generalized knowledge graph proposal based on the Discourse Graph work by Joel Chan and David Vargas, so thank you to them. Other inspiration is listed in: # Resources

Goals

Easily create a relationship between two blocks.

It should be possible to create this relationship inline and on any page. This keeps relationship creation from disrupting the note-taking process and makes it useful for as many note-taking styles and systems as possible.

Relationships should be visible and human-readable in the underlying markdown for maximum future-proofing.

Surface and query these relationships.

When on a given page, I should be able to see links to all the notes that have a relationship with that page.

I should be able to query based on relationship, and be able to chain multiple queries together.

A system that supports integrating knowledge graphs together in the future

The current syntax and relationship system should be ready to eventually support plugging two knowledge graphs together by creating a translation between their grammars.

A basic, no-plugin version

Syntax: relationship + arrow

Support the following syntax for aspects (“aspect” is the term used here for a “subject - relationship - object” phrase):

- [[some evidence]]
    - [[supports]] ->
        - [[a claim]]
- [[a claim]]
    - <- [[supports]]
        - [[some evidence 1]]
        - [[some evidence 2]]

On the page “a claim” itself:

- <- [[supports]]
    - [[some evidence]]

On the page “some evidence” itself:

- [[supports]] ->
    - [[a claim]]
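To illustrate how this syntax could be read mechanically, here is a minimal sketch that turns an indented outline into (subject, relationship, object) aspects. It is not Logseq's own parser. The names `Aspect`, `parse_aspects`, `indent_level` and `first_ref` are hypothetical, and the sketch assumes one page reference per block, one tab or four spaces per nesting level, and blocks nested under a parent block (not the page-level form).

```python
import re
from typing import List, NamedTuple, Optional

class Aspect(NamedTuple):
    subject: str
    relationship: str
    obj: str

PAGE_REF = re.compile(r"\[\[(.+?)\]\]")

def indent_level(line: str) -> int:
    """Nesting depth, counting one tab or four spaces as one level (an assumption)."""
    indent = line[: len(line) - len(line.lstrip(" \t"))]
    return indent.count("\t") + indent.count(" ") // 4

def first_ref(text: str) -> Optional[str]:
    """Name of the first [[page reference]] in a block, if any."""
    match = PAGE_REF.search(text)
    return match.group(1) if match else None

def parse_aspects(outline: str) -> List[Aspect]:
    aspects = []
    ancestors = []  # (level, text) for each block on the path to the current one
    for line in outline.splitlines():
        if not line.strip():
            continue
        level = indent_level(line)
        text = line.strip()
        if text.startswith("- "):
            text = text[2:].strip()
        # Drop blocks that are not ancestors of the current line.
        while ancestors and ancestors[-1][0] >= level:
            ancestors.pop()
        if len(ancestors) >= 2:
            grandparent, parent = ancestors[-2][1], ancestors[-1][1]
            rel, here, there = first_ref(parent), first_ref(text), first_ref(grandparent)
            if rel and here and there:
                if parent.endswith("->"):      # grandparent -relationship-> this block
                    aspects.append(Aspect(there, rel, here))
                elif parent.startswith("<-"):  # this block -relationship-> grandparent
                    aspects.append(Aspect(here, rel, there))
        ancestors.append((level, text))
    return aspects

EXAMPLE = """\
- [[some evidence]]
    - [[supports]] ->
        - [[a claim]]
- [[a claim]]
    - <- [[supports]]
        - [[some evidence 1]]
        - [[some evidence 2]]
"""

for aspect in parse_aspects(EXAMPLE):
    print(aspect)
```

Both arrow directions end up as the same kind of triple, with the evidence as subject and the claim as object, which is what lets the arrow stand in for a separate complement phrase.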

Pros

The arrow solves the problem of dealing with complements

Arrow provides a language agnostic relationship direction indicator

Uses existing block hierarchy structure already core to Logseq

Familiar to Discourse Graph users

Cons

Less readable when the arrow is pointing “backwards”, since the default is not to create a separate phrase for the complement

Queries

I haven’t finished converting these to Logseq queries yet. You can see my work in progress here: Knowledge graph queries.

x relationship ->: e.g. on evidence page, get claims it supports

- for block x (e.g. current page)
    - for every block r that contains a page reference to x
        - for each child block c of r
            - if c contains -> AND a page reference to [[relationship]]
                - return all children of c
        - get parent block p of r
            - if p contains <- AND a page reference to [[relationship]]
                - return parent of p
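The forward query can be sketched in plain Python over an in-memory block tree. This is a rough illustration under assumptions, not a Logseq query: `Block`, `walk`, `has_ref` and `outgoing` are hypothetical names, a page reference is detected by a simple substring test, and the `<-` arrow is checked on the parent block p of the referencing block r. The pages “another claim” and “other evidence” are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional

@dataclass(eq=False)
class Block:
    """A bare-bones outline block: text, a parent link, and child blocks."""
    text: str
    parent: Optional["Block"] = None
    children: List["Block"] = field(default_factory=list)

    def add(self, text: str) -> "Block":
        child = Block(text, parent=self)
        self.children.append(child)
        return child

def walk(block: Block) -> Iterator[Block]:
    """Yield a block and all of its descendants, depth first."""
    yield block
    for child in block.children:
        yield from walk(child)

def has_ref(block: Block, page: str) -> bool:
    return f"[[{page}]]" in block.text

def outgoing(roots: List[Block], x: str, relationship: str) -> List[str]:
    """x relationship ->: texts of the blocks that x stands in `relationship` to."""
    results = []
    for root in roots:
        for r in walk(root):
            if not has_ref(r, x):
                continue
            # r has a "[[relationship]] ->" child: its children are the targets.
            for c in r.children:
                if "->" in c.text and has_ref(c, relationship):
                    results.extend(child.text for child in c.children)
            # r sits under a "<- [[relationship]]" block: that block's parent is the target.
            p = r.parent
            if p is not None and p.parent is not None and "<-" in p.text and has_ref(p, relationship):
                results.append(p.parent.text)
    return results

evidence = Block("[[some evidence]]")
evidence.add("[[supports]] ->").add("[[a claim]]")
claim = Block("[[another claim]]")
listed = claim.add("<- [[supports]]")
listed.add("[[some evidence]]")
listed.add("[[other evidence]]")

print(outgoing([evidence, claim], "some evidence", "supports"))
# ['[[a claim]]', '[[another claim]]']
```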

<- relationship x: e.g. claim page, get supporting evidence

- for block x
    - for every block r that contains a page reference to x
        - get parent block p of r
            - if p contains -> AND a page reference to [[relationship]]
                - return parent of p
        - for each child block c of r
            - if c contains <- AND a page reference to [[relationship]]
                - return all children of c
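The reverse query is the mirror image of the forward one. A minimal sketch follows, this time over nested `(text, children)` tuples so that the parent and grandparent are passed along during the walk. `walk`, `has_ref` and `incoming` are hypothetical names, and the substring test for page references is a simplification.

```python
from typing import Iterator, List, Optional, Tuple

Node = Tuple[str, list]  # (block text, list of child nodes)

def walk(node: Node, parent: Optional[Node] = None,
         grandparent: Optional[Node] = None) -> Iterator[tuple]:
    """Yield (node, parent, grandparent) for every block in the tree."""
    yield node, parent, grandparent
    for child in node[1]:
        yield from walk(child, node, parent)

def has_ref(node: Node, page: str) -> bool:
    return f"[[{page}]]" in node[0]

def incoming(roots: List[Node], x: str, relationship: str) -> List[str]:
    """<- relationship x: texts of the blocks that stand in `relationship` to x."""
    results = []
    for root in roots:
        for r, p, grandparent in walk(root):
            if not has_ref(r, x):
                continue
            # r sits under a "[[relationship]] ->" block: that block's parent is the source.
            if p is not None and grandparent is not None and "->" in p[0] and has_ref(p, relationship):
                results.append(grandparent[0])
            # r has a "<- [[relationship]]" child: its children are the sources.
            for c in r[1]:
                if "<-" in c[0] and has_ref(c, relationship):
                    results.extend(source[0] for source in c[1])
    return results

roots = [
    ("[[some evidence]]", [("[[supports]] ->", [("[[a claim]]", [])])]),
    ("[[a claim]]", [("<- [[supports]]", [("[[some evidence 1]]", []),
                                          ("[[some evidence 2]]", [])])]),
]

print(incoming(roots, "a claim", "supports"))
# ['[[some evidence]]', '[[some evidence 1]]', '[[some evidence 2]]']
```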

Proposal for a knowledge graph plugin

Suggested Principles

Use the above queries to track relationships, and keep all the relationship data in the notes.

Be mainly a layer for surfacing, browsing, and querying the relationships that are built in the notes.

Keep the grammar non-restrictive and let users create relationships between any two notes by default. The plugin can still make it possible to create a more restrictive grammar as well, and you can see in this section how I suggest doing so: # How and when to add restrictive grammar

For one, a non-restrictive grammar means users do not have to add a prefix to the front of every linked note, as in Discourse Graph. This will be beneficial for use cases that want to connect a large volume of notes in their knowledge graph but don’t want to be restricted to a specific note naming system.

This also enables including blocks in the knowledge graph

This allows for more dynamic use of the knowledge graph with less “pre-planning”. This can have some downsides too, as it could blur what a relationship means if you’re using supports to mean slightly different things in different contexts.

Downside: it would be harder to style links with inline CSS the way Discourse Graph does

Implementation priorities

Make relationships visible on any page that has them.

From what I can tell plugins can’t integrate into pages yet, but I think ideally there would be a section above “Linked References” that shows “Relationships” and all the pages with relationships to this page grouped by relationship type (Supports ->, <- Supports, etc)
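As a rough idea of what the data behind such a “Relationships” section might look like, the sketch below groups aspects by relationship type and direction for one page. It assumes the aspects have already been extracted as (subject, relationship, object) tuples. `relationships_for` is a hypothetical name, and the `opposes` relationship and “old theory” page are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relationship, object)

def relationships_for(aspects: Iterable[Triple], page: str) -> Dict[str, List[str]]:
    """Group the pages related to `page` under headings like 'supports ->' or '<- supports'."""
    groups = defaultdict(list)
    for subject, relationship, obj in aspects:
        if subject == page:
            groups[f"{relationship} ->"].append(obj)
        if obj == page:
            groups[f"<- {relationship}"].append(subject)
    return dict(groups)

aspects = [
    ("some evidence", "supports", "a claim"),
    ("a claim", "opposes", "old theory"),
    ("other evidence", "supports", "a claim"),
]

for heading, pages in relationships_for(aspects, "a claim").items():
    print(heading, pages)
```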

Create a query interface.

First basic single queries (all notes that support note x)

Then support chained queries

(e.g. return all blocks that support any blocks that have author)

Support AND and OR on pages (all notes that support block x OR block y)
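One way to picture these query types is as set operations over extracted aspects. The sketch below is an assumption about how a plugin might model them, not an existing Logseq API. `subjects_of` is a hypothetical helper, and reading “have author” as an `author` relationship pointing at a person page (“Jane”) is a guess made for the example.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relationship, object)

def subjects_of(aspects: Iterable[Triple], relationship: str, objects: Set[str]) -> Set[str]:
    """All subjects that stand in `relationship` to at least one of `objects`."""
    return {s for s, r, o in aspects if r == relationship and o in objects}

aspects = [
    ("evidence 1", "supports", "claim A"),
    ("evidence 2", "supports", "claim B"),
    ("evidence 3", "supports", "claim A"),
    ("evidence 3", "supports", "claim B"),
    ("claim A", "author", "Jane"),
]

# Basic single query: all notes that support claim A.
single = subjects_of(aspects, "supports", {"claim A"})

# OR: notes that support claim A or claim B (pass both objects at once).
either = subjects_of(aspects, "supports", {"claim A", "claim B"})

# AND: notes that support both claims (intersect two single queries).
both = subjects_of(aspects, "supports", {"claim A"}) & subjects_of(aspects, "supports", {"claim B"})

# Chained: notes that support any note whose author is Jane.
authored = subjects_of(aspects, "author", {"Jane"})
chained = subjects_of(aspects, "supports", authored)

print(sorted(single), sorted(either), sorted(both), sorted(chained))
```

Because every query returns a set of notes, the output of one query can be fed in as the object set of the next, which is all that chaining needs in this model.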

Create a graph interface

See a visual graph of all the pages (and blocks) with their labeled connections

I believe this would be more valuable than the existing global graph

Option to change the syntax for identifying directionality

Users may not want to use ->

Users may prefer to use semantic complements like in Discourse Graph (supports and supported by)

This could be done by creating a special reserved relationship called complements, and allowing a user to create a complement as follows

- [[supports]]
    - [[complements]] ->
        - [[supported by]]

and now the plugin can treat the following two aspects as equivalent

- [[some evidence]]
    - [[supports]] ->
        - [[a claim]]
- [[a claim]]
    - [[supported by]] ->
        - [[some evidence]]
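A minimal sketch of how the equivalence could be computed: rewrite every aspect that uses a declared complement into its canonical relationship with subject and object swapped. It assumes aspects are available as (subject, relationship, object) tuples. `normalize` is a hypothetical name, and treating the left-hand side of `complements` as the canonical form is an assumption.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relationship, object)

def normalize(aspects: Iterable[Triple]) -> Set[Triple]:
    """Rewrite complement relationships into their canonical direction."""
    aspects = list(aspects)
    # ("supports", "complements", "supported by") maps "supported by" -> "supports".
    canonical = {o: s for s, r, o in aspects if r == "complements"}
    normalized = set()
    for subject, relationship, obj in aspects:
        if relationship == "complements":
            continue  # declarations themselves are left out of the result
        if relationship in canonical:
            # Swap the direction and use the canonical relationship name.
            subject, relationship, obj = obj, canonical[relationship], subject
        normalized.add((subject, relationship, obj))
    return normalized

aspects = [
    ("supports", "complements", "supported by"),
    ("some evidence", "supports", "a claim"),
    ("a claim", "supported by", "some evidence"),
]

print(normalize(aspects))
# {('some evidence', 'supports', 'a claim')}
```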

Allow for blocks to be part of the knowledge graph like pages

Requires updating the query

Requires figuring out how to handle a block that contains both text and page references: should both the block and the page references be considered part of the relationship? Not sure.

How and when to add restrictive grammar

Keep all the information encoded in the notes

I suggest continuing to use the existing simple syntax convention for relationships to add any grammar. This feels more in line with the outliner DNA of Logseq, as it uses the block hierarchies to encode information, and keeps everything visible in the markdown (rather than in an extra layer of data that only exists within the plugin).

How a user could add their own types

Use the existing syntax and create a new relationship (calling it whatever they want; here I use type)

- [[I looked up today and saw a blue sky]]
	- [[type]] ->
		- [[evidence]]

Side note, since any note can have a relationship, including relationships, you can create interesting layers…

- [[empirical]]
    - <- [[type]]
        - [[evidence]]
        - [[experiment]]

and we can go all sorts of places…

- [[type]]
	- [[type]] ->
		- [[relationship]]

A user might prefer to use a page-property here instead, since

Then to actually restrict relationships, the plugin could reserve a special relationship (e.g. grammar) for declaring the grammar of a relationship:

- [[supports]]
    - [[grammar]] ->
        - ([[type]] -> [[evidence]]) -> ([[type]] -> [[claim]])

This would be parsed by the plugin as:

supports is a relationship from a block whose relationship type is evidence to a block whose relationship type is claim

If desired, the plugin could also automatically add the required relationship when it is missing
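To make the parsing and checking concrete, here is a hedged sketch that reads a grammar line of the page-reference form and reports which type aspects are missing for a restricted relationship. A plugin could use that list either to warn the user or to add the missing aspects. `parse_grammar` and `missing_aspects` are hypothetical names, the regular expression only covers the exact `([[a]] -> [[b]]) -> ([[c]] -> [[d]])` shape, and the “the sky is blue” page is invented for the example.

```python
import re
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relationship, object)

GRAMMAR = re.compile(
    r"\(\[\[(?P<src_rel>.+?)\]\] -> \[\[(?P<src_val>.+?)\]\]\)"
    r" -> "
    r"\(\[\[(?P<dst_rel>.+?)\]\] -> \[\[(?P<dst_val>.+?)\]\]\)"
)

def parse_grammar(text: str) -> Tuple[Tuple[str, str], Tuple[str, str]]:
    """Return ((relationship, value) required of the source, same for the target)."""
    match = GRAMMAR.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized grammar: {text!r}")
    return (match["src_rel"], match["src_val"]), (match["dst_rel"], match["dst_val"])

def missing_aspects(aspects: Iterable[Triple], relationship: str, grammar: str) -> List[Triple]:
    """Aspects the grammar requires but the notes do not contain."""
    (src_rel, src_val), (dst_rel, dst_val) = parse_grammar(grammar)
    aspects = list(aspects)
    known = set(aspects)
    missing = []
    for subject, rel, obj in aspects:
        if rel != relationship:
            continue
        if (subject, src_rel, src_val) not in known:
            missing.append((subject, src_rel, src_val))
        if (obj, dst_rel, dst_val) not in known:
            missing.append((obj, dst_rel, dst_val))
    return missing

GRAMMAR_LINE = "([[type]] -> [[evidence]]) -> ([[type]] -> [[claim]])"
aspects = [
    ("I looked up today and saw a blue sky", "type", "evidence"),
    ("I looked up today and saw a blue sky", "supports", "the sky is blue"),
]

print(missing_aspects(aspects, "supports", GRAMMAR_LINE))
# [('the sky is blue', 'type', 'claim')]
```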

There are a lot of tricky details to this, and certainly some plugin-side caching will be required eventually since the queries could get messy, so I’m happy to discuss more.

If preferred, the plugin could allow the user to provide Logseq page and block properties to create a grammar. This could still be done in the notes like so:

- [[supports]]
    - [[grammar]] ->
        - (tags:evidence) -> (tags:claim)

It’s possible there is never a need to actually enforce a restricted grammar inside of one’s personal knowledge graph. This is something to be considered.

When to restrict grammar

Provide more information when exporting one’s knowledge graph

To translate your knowledge graph so it can be integrated into someone else’s, the more detail you provide about how your relationships work, the better.

By default, any block can have a relationship with any other block. But if more detail is added, then it will be easier to make sense of how the relationships in your knowledge graph match up with the relationships in my knowledge graph.

Future theoretical consideration: Ideally one would create a category out of their knowledge graph, so that a functor translating their knowledge graph to another knowledge graph would be guaranteed to have proper composition and matched types.

See # Resources for more info

(Maybe) Guide the user if they’re trying to do something that isn’t supposed to work

If they’re trying to create a relationship they’ve declared as restrictive between unmatched block types, the plugin can warn them.

This can be tricky, so I would consider this a secondary priority

More notes

Relationships as functions

I think of these relationships as functions, so those of you who know functional programming may see where I’m coming from.

I also think about this knowledge graph as eventually being a category, but that’s beyond the scope of this spec.

Many to many

Relationships support one to one and one to many using the native block hierarchies (parent to child or parent to many children)

Could extend this by letting users put a list in the parent block like so

- [[some evidence 1]], [[some evidence 2]], [[some evidence 3]]
    - [[supports]] ->
        - [[claim A]]
        - [[claim B]]
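Under the assumption that every page listed in the parent block relates to every child block, the example above expands into the full cross product of subjects and objects. A short sketch, with `expand` as a hypothetical helper name:

```python
import re
from itertools import product
from typing import List, Tuple

PAGE_REF = re.compile(r"\[\[(.+?)\]\]")

def expand(subject_block: str, relationship: str, object_blocks: List[str]) -> List[Tuple[str, str, str]]:
    """One aspect per (subject, object) pair found in the parent and child blocks."""
    subjects = PAGE_REF.findall(subject_block)
    objects = [ref for block in object_blocks for ref in PAGE_REF.findall(block)]
    return [(s, relationship, o) for s, o in product(subjects, objects)]

triples = expand(
    "[[some evidence 1]], [[some evidence 2]], [[some evidence 3]]",
    "supports",
    ["[[claim A]]", "[[claim B]]"],
)

for triple in triples:
    print(triple)
```

Three subjects and two objects give six aspects here. Whether the cross product is the intended meaning in every case is an open design question.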

Plugin practicality

Some of these proposals might be impractical for a plugin to implement, as I’ve approached most of this from a theoretical side more than a practical software side. However, assuming the queries can be made reasonably fast (either in place or through caching), I don’t think anything here is infeasible.

Relationships vs properties

This has some similarities to the already richly supported properties in Logseq. I would suggest that making it possible to create rich relationships inline in one’s writing makes a huge difference (when I tried to use properties to do this, I ended up writing a lot of my notes in the properties, making my notes look more like a database table than an outliner or writing tool). I’m happy to discuss this more as I think the difference between this proposed functionality and properties is a subtle but important one.

This difference will be even more important if relationships are also supported between blocks and not just pages

World Knowledge Graph

I’m not sure how Logseq is planning to build their stated World Knowledge Graph, but this seems like a reasonable way to begin letting users add structure to their personal knowledge graphs that could eventually be integrated with other knowledge graphs.

If this were to become central to Logseq’s direction, it would make sense to store these relationships in the Logseq database. That would solve any latency concerns and create a very strong foundation for building tools to interface with your knowledge graph.