Getting the correct timezone offset when using current_timestamp in Apache Spark

Question

I am new to both Java and Apache Spark and am trying to understand how timestamps and timezones are used.
I would like all the timestamps in the data I get from an Apache Spark DataFrame to be stored in SQL Server in the EST timezone.

When I use current_timestamp, I get the correct EST time, but the offset I see when I look at the data is '+00:00' instead of '-04:00'.

Here is a value stored in the database that was passed in from the Spark dataset:
2020-04-07 11:36:23.0220 +00:00

From what I can see, current_timestamp does not accept a timezone argument. The time itself is correct (it is in EST), but I don't understand why the offset is wrong.

Any help understanding this would be great.

Answer 1

Score: 0

A Java Timestamp behaves more or less like a LocalDateTime in Java: it does not contain any timezone information. The database therefore interprets the value as a UTC timestamp, which is why you get a mismatch. I usually use one of two approaches, depending on which suits better:

  1. Return a UTC timestamp from Spark (with a simple custom UDF) instead of using current_timestamp, which is timezone specific.
  2. Encode your dates as strings. Similarly, using the java.time API, you can achieve that with a simple UDF.

Hope things are a bit clearer now.
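
As a rough sketch of the second approach, the helper below uses java.time to format an instant as a string that carries an explicit offset. The class name, method name, pattern, and the America/New_York zone ID are assumptions chosen to match the value shown in the question; they are not part of the original answer, and wrapping the helper in a Spark UDF is left out.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class OffsetStringSketch {
    // Assumed pattern: four fractional digits followed by an offset such as -04:00.
    static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSS xxx");

    // Hypothetical helper: renders an instant in the given zone, offset included.
    static String formatWithOffset(Instant instant, String zoneId) {
        return ZonedDateTime.ofInstant(instant, ZoneId.of(zoneId)).format(FMT);
    }

    public static void main(String[] args) {
        // 15:36:23.022 UTC is 11:36:23.022 in US Eastern time on this date.
        Instant instant = Instant.parse("2020-04-07T15:36:23.022Z");
        System.out.println(formatWithOffset(instant, "America/New_York"));
        // prints 2020-04-07 11:36:23.0220 -04:00
    }
}
```

Because the offset is part of the string, nothing downstream has to guess which zone the wall-clock time belongs to.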

Answer 2

Score: 0

I convert the default UTC timestamp to the local timezone as below:

current_timestamp1 = current_timestamp(),
current_timestamp2 = from_utc_timestamp(current_timestamp(), "Australia/Sydney")
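
For intuition only, the plain java.time sketch below mimics what a from_utc_timestamp-style conversion does, as I understand it: it treats the input as a UTC wall-clock time and returns the wall-clock time in the target zone, still without any offset attached. The class and method names are hypothetical and are not Spark API.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class FromUtcSketch {
    // Hypothetical helper: reinterpret a UTC wall-clock value in another zone.
    static LocalDateTime fromUtc(LocalDateTime utcWallClock, String zoneId) {
        return utcWallClock.atOffset(ZoneOffset.UTC)      // pin the value to UTC
                .atZoneSameInstant(ZoneId.of(zoneId))     // same instant, target zone
                .toLocalDateTime();                       // drop the zone again
    }

    public static void main(String[] args) {
        LocalDateTime utc = LocalDateTime.of(2020, 4, 7, 15, 36, 23);
        System.out.println(fromUtc(utc, "Australia/Sydney"));
        // prints 2020-04-08T01:36:23
    }
}
```

Note that the result carries no offset, so a database that assumes UTC would still label a value shifted this way with '+00:00'.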

huangapple
  • Posted on April 8, 2020, 00:09:35
  • When reposting, please keep the link to this article: https://java.coder-hub.com/61084442.html