Telegram Web Link
Scalar functions name resolution special behavior in Snowflake.

I never get tired of repeating that writing a universal compiler for different SQL dialects is impossible.
Today, I will tell you about the behavior of name resolution in Snowflake inside CREATE VIEW.

When you execute queries, Snowflake looks for the objects specified in the query in the schemas specified in SEARCH_PATH. You can check them like this:

SELECT current_schemas();


But it seems like a good idea to make the creation of VIEWs independent of SEARCH_PATH. Otherwise, we will get different results when we work with VIEWs at different SEARCH_PATH.

The documentation says the following:
The SEARCH_PATH is not used inside views or UDFs. All unqualifed objects in a view or UDF definition will be resolved in the view’s or UDF’s schema only.


That's great! And it works!

CREATE OR REPLACE DATABASE db1;
CREATE SCHEMA sh1;

CREATE TABLE public.t1(c1 int);

CREATE VIEW sh1.v1 AS
SELECT * FROM t1;

will return
SQL compilation error:
Object 'DB1.SH1.T1' does not exist or not authorized.


And not only for tables. For any objects, except … scalar functions!

CREATE OR REPLACE DATABASE db1;
CREATE SCHEMA sh1;
CREATE TABLE sh1.t1(c1 int);
INSERT INTO sh1.t1(c1) VALUES (1);

CREATE FUNCTION public.test()
RETURNS NUMBER
LANGUAGE SQL
AS '1';

CREATE VIEW sh1.v1 AS
SELECT *, test() c2 FROM t1;

SELECT * FROM sh1.v1;

will return

C1 C2
1 1


Strange behavior, don't you agree?

In PostgreSQL, for example, it works like this: when creating a VIEW, all objects without schema specification are searched in the public schema. If you want a different schema, specify it by hand.

But maybe I'm being picky? Let's add one more thing…

CREATE FUNCTION sh1.test()
RETURNS NUMBER
LANGUAGE SQL
AS '2';

SELECT * FROM sh1.v1;

will return

C1 C2
1 2


Oops… If there is a scalar function in the schema where VIEW is created, it will be used. If not, the function from PUBLIC will be used.

I.e. if you didn't specify a schema for a scalar function from the PUBLIC schema while creating a VIEW, in a schema other than PUBLIC, then to corrupt the data in your database it is enough to create a function with the same name in the corresponding schema…

At dwh.dev, we know about a lot of these nuances. So you are unlikely to find a better data lineage for Snowflake 🙂

linkedin likes goto https://www.linkedin.com/feed/update/urn:li:activity:7132776122580152320/
1
I wrote an article about the Snowflake update and how startups are getting screwed by it, but my graphomania didn't fit into the damn LinkedIn, so I had to split it into three.

I won't torture you here either, so here's a link to the whole article:
https://dwh.dev/blog/startup-killers

and you can like it on LinkedIn here, here and here (please).

There's just an article from Alex on the same topic (forward next post).

"What are you going to do if Snowflake (Amazon, Google, name it) do the same thing?" - every investor's favorite question...
Forwarded from .и в продакшен (Alex Yumashev)
Сейчас будет скучный пост про стартапы и бизнес, Потому что в ChatGPT вчера добавили поддержку PDF файлов

(при чем тут это?)

При том, что десятки инди-стартапов вчера дружно построились и пошли умирать. Все эти pdf.ai, ChatPDF, AskYourPDF внезапно стали не нужны.

Прикольно, что все это очень смахивает на историю с Amazon Basics.

Вы же в курсе, как работает Амазон? Если они в аналитике видят выстреливший продукт от независимого поставщика - они быстро его клонируют и выпускают под брендом Amazon Basics. И впихивают свой клон в топ поиска. А потом наваливаются всей своей логистическо-ритейлово-маркетиноговой мощью. Оригинальный продукт через месяц умирает.

За это их много раз таскали по судам, антимонопольным разбирательствам и даже на слушания в Конгрессе. После чего они... просто научились лучше прятаться. Теперь копии продаются под "независимыми" фейковыми брендами.

Если вы делаете тупой враппер для GPT в горизонтальной нише - OpenAI убьет вас так же, как Amazon Basics убивает инди-ритейл. Я почти уверен, что они посматривают на свою API-аналитику и как только видят где-то "хоккейную клюшку" и легко реплицируемый продукт - добавляют его в виде фичи.

В английском бизнес-жаргоне пару лет назад стал популярен классный термин "moat". Буквальный перевод - "крепостной ров". Moat - это то, что защитит вас от конкурентов и platform-риска. Это ваш момент импульса (тот самый, который не дает гироскопу отклоняться от оси). Moat - это то, почему нельзя просто взять и скопировать бизнес-модель. Это не обязательно фича продукта - возможно это преданная аудитория. Или вы "go to" бренд в своей нише. А возможно ваш продукт плотно и незаметно интегрируется в критические бизнес-процессы клиента, и слезть с него тупо невозможно. Ну или вы - AWS и берете такие баблищи за траффик что переезжать очень больно.

Если у вас нет moat - бизнес под угрозой.

В общем, хватит пилить врапперы для ЖПТшки.

Пилите врапперы "with a twist". Ну, например, я не знаю, "PDF-чатилка для адвокатов да еще и совместимая с GDPR".

AI is a feature, not a product ☝️
👍15🤔1
Forwarded from DOFH - DevOps from hell
😁246
Lads, if you use Snowflake or other analytical databases, please like this post!

Maybe you remember, I used to make a service for static analysis of PostgreSQL (holistic.dev) and posted stuff about PG.

For quite some time now, I've been doing an observability platform for Snowflake (dwh.dev) with my team, and I'm fully immersed in this area.

And would love to understand how interested you guys are in this at all.
Is it worth sharing details about creating such software? Or is it necessary to bring back SQL-WTF about PostgreSQL?
In short, tell us what you would like to see.

PS: Yes, by the way, we have Snowflake chat rooms (RU/EN) and a channel with Snowflake news.
👍17👎1
There will be no singularity
Days without explanation JOINs via Venn diagrams by data influencers: 0 https://blog.jooq.org/say-no-to-venn-diagrams-when-explaining-joins/
A bit of explanation in my own words:

1) Venn diagrams describe interactions between sets. However, when we perform a JOIN, we don't get a single set but a set of sets (a table). Therefore, Venn diagrams aren't suitable for describing what actually happens after a JOIN.

2) In such illustrations (even if supplemented with tables), usually only one, almost the rarest case is depicted: unique values in both columns and an equality condition for the join. There are no repetitions, no NULLs, and no complex join conditions.

Previously, I created a static analyzer for PostgreSQL - holistic.dev. Besides type inference, I wanted to infer two more important parameters: constraints and the row count class. If we have information about unique indexes and foreign keys, we can infer the uniqueness of values, NULLability, and the row count class.

What are these classes? NONE, ONE, ONE_OR_NONE, MANY, MANY_OR_NONE.
Why are they needed? For instance, automatic asserts in code can be created. These exist in the Node.js driver. Or if you store your SQL in separate files, automatic checks can be made to ensure nothing broke during query or schema updates and to detect Row Explosion.

So, outputting these classes and constraints were the most complex parts of the entire compiler :)
All because of JOIN. Maybe that's why my eye still twitches when I see such explainers...

People might look at this and then receive bills for 50 million...

Conclusion: Venn diagrams are suitable for explaining UNION/INTERSECT/EXCEPT. For JOIN, this picture by Andreas Martinson is more suitable.
👍12🥰6🤯2
Channel name was changed to «There will be no singularity»
Don't be alarmed. It's time to rewrite the name of the channel in English 🙂
🤡16👏10😱4👍3🥴2🆒1
New week, new "special behavior" in Snowflake 🙂
In Snowflake, there is a special function IDENTIFIER() that allows referencing objects using strings and session variables:


SELECT
*
FROM
IDENTIFIER($my_table)


Documentation can be found here: https://docs.snowflake.com/en/sql-reference/identifier-literal

Additionally, you can reference an object in the column list:


SELECT
IDENTIFIER($my_table).*
FROM
IDENTIFIER($my_table)


Wait, that's not the "special behavior" yet...

1)

IDENTIFIER($my_table).*

works, but

IDENTIFIER($my_table).column_name

does not

This looks very strange... But until you try this:

2)

SELECT
IDENTIFIER($my_table1).*,
IDENTIFIER($my_table2).*
FROM
IDENTIFIER($my_table1),
IDENTIFIER($my_table2)

... and you'll get twice of all the columns 🙂

Because IDENTIFIER(whats_ever).* gets replaced by just *!

As a result, the above query is converted to


SELECT
*,
*
FROM
IDENTIFIER($my_table1),
IDENTIFIER($my_table2);


PS: send your reactions here 🙂
🤯5🤮4👍1🐳1🗿1
https://jepsen.io/analyses/mysql-8.0.34

Reason for not using MySQL #1234:
MySQL’s “Repeatable Read” does not satisfy PL-2.99 Repeatable Read
🔥9👍5🤯2👀2
2025/07/14 21:03:04
Back to Top
HTML Embed Code: